Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-19 Thread Richard Biener
On Tue, Nov 18, 2014 at 10:04 PM, David Malcolm  wrote:
> On Tue, 2014-11-18 at 10:53 +0100, Richard Biener wrote:
>> On Tue, Nov 18, 2014 at 2:59 AM, David Malcolm  wrote:
>> > On Mon, 2014-11-17 at 11:06 +0100, Richard Biener wrote:
>> >> On Sat, Nov 15, 2014 at 12:00 PM, David Malcolm  
>> >> wrote:
>> >> > On Thu, 2014-11-13 at 11:45 +0100, Richard Biener wrote:
>> >> >> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm  
>> >> >> wrote:
>> >> >> > On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
>> >> >> >> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek  
>> >> >> >> wrote:
>> >> >> >> > On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>> >> >> >> >> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>> >> >> >> >> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>> >> >> >> >> > > To be constructive here - the above case is from within a
>> >> >> >> >> > > GIMPLE_ASSIGN case label
>> >> >> >> >> > > and thus I'd have expected
>> >> >> >> >> > >
>> >> >> >> >> > > case GIMPLE_ASSIGN:
>> >> >> >> >> > >   {
>> >> >> >> >> > > gassign *a1 = as_a  (s1);
>> >> >> >> >> > > gassign *a2 = as_a  (s2);
>> >> >> >> >> > >   lhs1 = gimple_assign_lhs (a1);
>> >> >> >> >> > >   lhs2 = gimple_assign_lhs (a2);
>> >> >> >> >> > >   if (TREE_CODE (lhs1) != SSA_NAME
>> >> >> >> >> > >   && TREE_CODE (lhs2) != SSA_NAME)
>> >> >> >> >> > > return (operand_equal_p (lhs1, lhs2, 0)
>> >> >> >> >> > > && gimple_operand_equal_value_p 
>> >> >> >> >> > > (gimple_assign_rhs1 (a1),
>> >> >> >> >> > >  
>> >> >> >> >> > > gimple_assign_rhs1 (a2)));
>> >> >> >> >> > >   else if (TREE_CODE (lhs1) == SSA_NAME
>> >> >> >> >> > >&& TREE_CODE (lhs2) == SSA_NAME)
>> >> >> >> >> > > return vn_valueize (lhs1) == vn_valueize (lhs2);
>> >> >> >> >> > >   return false;
>> >> >> >> >> > >   }
>> >> >> >> >> > >
>> >> >> >> >> > > instead.  That's the kind of changes I have expected and 
>> >> >> >> >> > > have approved of.
>> >> >> >> >> >
>> >> >> >> >> > But even that looks like just adding extra work for all 
>> >> >> >> >> > developers, with no
>> >> >> >> >> > gain.  You only have to add extra code and extra temporaries, 
>> >> >> >> >> > in switches
>> >> >> >> >> > typically also have to add {} because of the temporaries and 
>> >> >> >> >> > thus extra
>> >> >> >> >> > indentation level, and it doesn't simplify anything in the 
>> >> >> >> >> > code.
>> >> >> >> >>
>> >> >> >> >> The branch attempts to use the C++ typesystem to capture 
>> >> >> >> >> information
>> >> >> >> >> about the kinds of gimple statement we expect, both:
>> >> >> >> >>   (A) so that the compiler can detect type errors, and
>> >> >> >> >>   (B) as a comprehension aid to the human reader of the code
>> >> >> >> >>
>> >> >> >> >> The ideal here is when function params and struct field can be
>> >> >> >> >> strengthened from "gimple" to a subclass ptr.  This captures the
>> >> >> >> >> knowledge that every use of a function or within a struct has a 
>> >> >> >> >> given
>> >> >> >> >> gimple code.
>> >> >> >> >
>> >> >> >> > I just don't like all the as_a/is_a stuff enforced everywhere,
>> >> >> >> > it means more typing, more temporaries, more indentation.
>> >> >> >> > So, as I view it, instead of the checks being done cheaply (yes, 
>> >> >> >> > I think
>> >> >> >> > the gimple checking as we have right now is very cheap) under the
>> >> >> >> > hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those 
>> >> >> >> > changes
>> >> >> >> > put the burden on the developers, who has to check that manually 
>> >> >> >> > through
>> >> >> >> > the as_a/is_a stuff everywhere, more typing and uglier syntax.
>> >> >> >> > I just don't see that as a step forward, instead a huge step 
>> >> >> >> > backwards.
>> >> >> >> > But perhaps I'm alone with this.
>> >> >> >> > Can you e.g. compare the size of - lines in your patchset 
>> >> >> >> > combined, and
>> >> >> >> > size of + lines in your patchset?  As in, if your changes lead to 
>> >> >> >> > less
>> >> >> >> > typing or more.
>> >> >> >>
>> >> >> >> I see two ways out here.  One is to add overloads to all the 
>> >> >> >> functions
>> >> >> >> taking the special types like
>> >> >> >>
>> >> >> >> tree
>> >> >> >> gimple_assign_rhs1 (gimple *);
>> >> >> >>
>> >> >> >> or simply add
>> >> >> >>
>> >> >> >> gassign *operator ()(gimple *g) { return as_a  (g); }
>> >> >> >>
>> >> >> >> into a gimple-compat.h header which you include in places that
>> >> >> >> are not converted "nicely".
>> >> >> >
>> >> >> > Thanks for the suggestions.
>> >> >> >
>> >> >> > Am I missing something, or is the gimple-compat.h idea above not 
>> >> >> > valid C
>> >> >> > ++?
>> >> >> >
>> >> >> > Note that "gimple" is still a typedef to
>> >> >> >   gimple_statement_base *
>> >> >> > (as noted before, the gimple -> gimple * change would break e

Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.

2014-11-19 Thread Richard Biener
On Wed, Nov 19, 2014 at 8:21 AM, Richard Biener
 wrote:
> On November 19, 2014 8:13:09 AM CET, Wei Mi  wrote:
>>We see an inline problem as below caused by r201408
>>(https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html).
>>
>>hoo() {
>>  foo();
>>  ...
>>}
>>
>>foo {
>>  goo();
>>  ...
>>}
>>
>>foo is func splitted, so its body changes to
>>
>>foo {
>>  goo();
>>  ...
>>  foo.part();
>>}
>>
>>and the used_as_abstract_origin of cgraph node of foo will be set to
>>true after func splitting.
>>
>>In ipa-inline, when inlining foo into hoo, the original node of foo
>>will not be reused as clone node because used_as_abstract_origin of
>>cgraph node of foo is true and can_remove_node_now_p_1 will return
>>false, so that a new clone node of foo will be created. This is the
>>case in gcc-4_9.
>>In gcc-4_8, the original node of foo will be reused as clone node.
>>
>>gcc-4_8
>>foo
>>  |
>>goo
>>
>>gcc-4_9
>>foofoo_clone
>>\/
>>  goo
>>
>>Because of the difference of whether to create a new clone for foo,
>>when inlining goo to foo, the overall growth of inlining all callsites
>>of goo in gcc-4_8 will be less than gcc-4_9 (goo has two callsites in
>>gcc-4_9 but only one in gcc-4_8). If we have many cases like this,
>>gcc-4_8 will actually have more inline growth budget than gcc-4_9 and
>>will inline more aggressively than gcc-4_9.
>>
>>I don't understand the exact usage of the check about
>>node->used_as_abstract_origin in can_remove_node_now_p_1, but I feel
>>puzzled about following two points:
>>
>>1. https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html said the
>>patch was to ensure all abstract origin functions do have nodes
>>attached. However, even if the node of origin function is reused as a
>>clone node, a new clone node will be created in following code in
>>symbol_table::remove_unreachable_nodes if only the node that needs
>>abstract origin is reachable.
>>
>>  if (TREE_CODE (node->decl) == FUNCTION_DECL
>>  && DECL_ABSTRACT_ORIGIN (node->decl))
>>{
>>  struct cgraph_node *origin_node
>> = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl));
>>  origin_node->used_as_abstract_origin = true;
>>  enqueue_node (origin_node, &first, &reachable);
>>}
>>
>>2. DECL_ABSTRACT_ORIGIN(decl) seems only useful for debug info of
>>clone nodes. But now the check of used_as_abstract_origin affect
>>inline decisions, which should be the same with or without keeping
>>debug info.
>
> I think we need to keep the functions but do not need to account for them in 
> the unit size if we otherwise could remove them

Btw - please make sure to open a bug so this issue doesn't get lost.

Thanks,
Richard.

> Richard.
>
>>Thanks,
>>Wei.
>
>


Re: Maxim Kuvyrkov appointed Android sub-port reviewer

2014-11-19 Thread Maxim Kuvyrkov
On Nov 14, 2014, at 9:00 PM, H.J. Lu  wrote:

> On Sun, Apr 15, 2012 at 5:08 PM, David Edelsohn  wrote:
>>I am pleased to announce that the GCC Steering Committee has
>> appointed Maxim Kuvyrkov as reviewer for the Android sub-port.
>> 
>>Please join me in congratulating Maxim on his new role.
>> Maxim, please update your listing in the MAINTAINERS file.
>> 
>> Happy hacking!
>> David
>> 
> 
> Hi Maxim,
> 
> Have you added your name to MAINTAINERS?

Will do.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





Re: What is R_X86_64_GOTPLT64 used for?

2014-11-19 Thread H.J. Lu
On Mon, Nov 17, 2014 at 6:14 AM, Michael Matz  wrote:
> Hi,
>
> On Thu, 13 Nov 2014, H.J. Lu wrote:
>
>> Linker does:
>>
>> ... code that looks like it might create just one GOT slot ...
>>
>> So if  a symbol is accessed by both @GOT and @PLTOFF, its
>> needs_plt will be true and its got.plt entry will be used for
>> both @GOT and @GOTPLT.  @GOTPLT has no advantage
>> over @GOT, but potentially wastes a PLT entry.
>
> The above is not correct.  Had you tried you'd see this:
>
> % cat x.c
> extern void foo (void);
> void main (void)
> {
>   void (*f)(void) = foo;
>   f();
>   foo();
> }
> % gcc -fPIE -mcmodel=large -S x.c; cat x.s
> ...
> movabsq $foo@GOT, %rax
> ...
> movabsq $foo@PLTOFF, %rax
> ...
>
> So, foo is access via @GOT offset and @PLTOFF.  Then,
>
> % cat y.c
> void foo (void) {}
> % gcc -o liby.so -shared -fPIC y.c
> % gcc -fPIE -mcmodel=large x.s liby.so
> % readelf -r a.out
> ...
> 00600ff8  00040006 R_X86_64_GLOB_DAT  foo + 0
> ...
> 00601028  00040007 R_X86_64_JUMP_SLO  foo + 0
> ...
>
> The first one (to 600ff8) is the normal GOT slot, the second one the GOT
> slot for the PLT entry.  Both are actually used:
>
> 004005f0 :
>   4005f0:   ff 25 32 0a 20 00   jmpq   *0x200a32(%rip)# 
> 601028 <_GLOBAL_OFFSET_TABLE_+0x28>

They are not:

Starting program: /export/home/hjl/bugs/binutils/gotplt/foo

Breakpoint 1, main () at main.c:5
5  void (*f)(void) = foo;
(gdb) disass
Dump of assembler code for function main:
   0x0040058d <+0>: push   %rbp
   0x0040058e <+1>: mov%rsp,%rbp
   0x00400591 <+4>: push   %rbx
   0x00400592 <+5>: sub$0x18,%rsp
   0x00400596 <+9>: lea-0x7(%rip),%rbx# 0x400596 
   0x0040059d <+16>: movabs $0x20042a,%r11
   0x004005a7 <+26>: add%r11,%rbx
=> 0x004005aa <+29>: movabs $0xfff8,%rax
   0x004005b4 <+39>: mov(%rbx,%rax,1),%rax
   0x004005b8 <+43>: mov%rax,-0x18(%rbp)
   0x004005bc <+47>: mov-0x18(%rbp),%rax
   0x004005c0 <+51>: callq  *%rax
   0x004005c2 <+53>: movabs $0xffdffad0,%rax
   0x004005cc <+63>: add%rbx,%rax
   0x004005cf <+66>: callq  *%rax
   0x004005d1 <+68>: mov$0x0,%eax
   0x004005d6 <+73>: add$0x18,%rsp
   0x004005da <+77>: pop%rbx
   0x004005db <+78>: pop%rbp
   0x004005dc <+79>: retq
End of assembler dump.
(gdb) b *0x004005c0
Breakpoint 2 at 0x4005c0: file main.c, line 6.
(gdb) b *0x004005cf
Breakpoint 3 at 0x4005cf: file main.c, line 7.
(gdb) c
Continuing.

Breakpoint 2, 0x004005c0 in main () at main.c:6
6  f();
(gdb) p $rax
$5 = 140737352012384
(gdb) disass $rax
Dump of assembler code for function foo:
   0x77df9260 <+0>: push   %rbp
   0x77df9261 <+1>: mov%rsp,%rbp
   0x77df9264 <+4>: lea0x7(%rip),%rdi# 0x77df9272
   0x77df926b <+11>: callq  0x77df9250 
   0x77df9270 <+16>: pop%rbp
   0x77df9271 <+17>: retq
End of assembler dump.
(gdb) c
Continuing.
foo

Breakpoint 3, 0x004005cf in main () at main.c:7
7  foo();
(gdb) p $rax
$6 = 4195472
(gdb) disass $rax
Dump of assembler code for function foo@plt:
   0x00400490 <+0>: jmpq   *0x200552(%rip)# 0x6009e8

   0x00400496 <+6>: pushq  $0x2
   0x0040049b <+11>: jmpq   0x400460
End of assembler dump.
(gdb)

> That uses the second GOT slot, and:
>
> 004006ec :
>   4006ec:   55  push   %rbp
>   4006ed:   48 89 e5mov%rsp,%rbp
>   4006f0:   53  push   %rbx
>   4006f1:   48 83 ec 18 sub$0x18,%rsp
>   4006f5:   48 8d 1d f9 ff ff fflea-0x7(%rip),%rbx# 
> 4006f5 
>   4006fc:   49 bb 0b 09 20 00 00movabs $0x20090b,%r11
>   400703:   00 00 00
>   400706:   4c 01 dbadd%r11,%rbx
>   400709:   48 b8 f8 ff ff ff ffmovabs $0xfff8,%rax
>   400710:   ff ff ff
>   400713:   48 8b 04 03 mov(%rbx,%rax,1),%rax
>
> This uses the first slot at 0x600ff8.
>
> So, no, currently GOT and GOTPLT (at least how it's supposed to be
> implemented) are not equivalent.

GOT reference:

  void (*f)(void) = foo;
  f();

gives you the address of function, foo, in liby.so, without going through
PLT, while

foo()

is called via PLT.  For function call, we must use PLT.  For pointer
reference, we don't use PLT slot:

1. We don't need the indirect branch in PLT.
2. All pointer references to the same function should have the same
value.

One way to optimize it is to make PLT entry to use the normal GOT
slot:

jmp  *name@GOTPCREL(%rip)
8 byte nop

where name@GOTPCREL points to the normal GOT slot
updated by R_X86_64_GLOB_DAT relocation at run-time.
Should I give it a try?


-- 
H.J.


Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.

2014-11-19 Thread Wei Mi
Submit a bug here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63970

Wei.

On Wed, Nov 19, 2014 at 2:18 AM, Richard Biener
 wrote:
> On Wed, Nov 19, 2014 at 8:21 AM, Richard Biener
>  wrote:
>> On November 19, 2014 8:13:09 AM CET, Wei Mi  wrote:
>>>We see an inline problem as below caused by r201408
>>>(https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html).
>>>
>>>hoo() {
>>>  foo();
>>>  ...
>>>}
>>>
>>>foo {
>>>  goo();
>>>  ...
>>>}
>>>
>>>foo is func splitted, so its body changes to
>>>
>>>foo {
>>>  goo();
>>>  ...
>>>  foo.part();
>>>}
>>>
>>>and the used_as_abstract_origin of cgraph node of foo will be set to
>>>true after func splitting.
>>>
>>>In ipa-inline, when inlining foo into hoo, the original node of foo
>>>will not be reused as clone node because used_as_abstract_origin of
>>>cgraph node of foo is true and can_remove_node_now_p_1 will return
>>>false, so that a new clone node of foo will be created. This is the
>>>case in gcc-4_9.
>>>In gcc-4_8, the original node of foo will be reused as clone node.
>>>
>>>gcc-4_8
>>>foo
>>>  |
>>>goo
>>>
>>>gcc-4_9
>>>foofoo_clone
>>>\/
>>>  goo
>>>
>>>Because of the difference of whether to create a new clone for foo,
>>>when inlining goo to foo, the overall growth of inlining all callsites
>>>of goo in gcc-4_8 will be less than gcc-4_9 (goo has two callsites in
>>>gcc-4_9 but only one in gcc-4_8). If we have many cases like this,
>>>gcc-4_8 will actually have more inline growth budget than gcc-4_9 and
>>>will inline more aggressively than gcc-4_9.
>>>
>>>I don't understand the exact usage of the check about
>>>node->used_as_abstract_origin in can_remove_node_now_p_1, but I feel
>>>puzzled about following two points:
>>>
>>>1. https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html said the
>>>patch was to ensure all abstract origin functions do have nodes
>>>attached. However, even if the node of origin function is reused as a
>>>clone node, a new clone node will be created in following code in
>>>symbol_table::remove_unreachable_nodes if only the node that needs
>>>abstract origin is reachable.
>>>
>>>  if (TREE_CODE (node->decl) == FUNCTION_DECL
>>>  && DECL_ABSTRACT_ORIGIN (node->decl))
>>>{
>>>  struct cgraph_node *origin_node
>>> = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl));
>>>  origin_node->used_as_abstract_origin = true;
>>>  enqueue_node (origin_node, &first, &reachable);
>>>}
>>>
>>>2. DECL_ABSTRACT_ORIGIN(decl) seems only useful for debug info of
>>>clone nodes. But now the check of used_as_abstract_origin affect
>>>inline decisions, which should be the same with or without keeping
>>>debug info.
>>
>> I think we need to keep the functions but do not need to account for them in 
>> the unit size if we otherwise could remove them
>
> Btw - please make sure to open a bug so this issue doesn't get lost.
>
> Thanks,
> Richard.
>
>> Richard.
>>
>>>Thanks,
>>>Wei.
>>
>>


About the GCC mirror on igor.onlinedirect.bg

2014-11-19 Thread igor

Hello,
I write to inform you that unfortunately OnlineDirect (the sponsoring 
company)

was acquired and the Igor machine will be stopped in the coming weeks.

Best regards,
Igor team



gcc-4.9-20141119 is now available

2014-11-19 Thread gccadmin
Snapshot gcc-4.9-20141119 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20141119/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 217816

You'll find:

 gcc-4.9-20141119.tar.bz2 Complete GCC

  MD5=52192d74d3b8009885130d9beb234aeb
  SHA1=fc53988f6b268ddaa01a9cdaf5570794fa4af963

Diffs from 4.9-20141112 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: What is R_X86_64_GOTPLT64 used for?

2014-11-19 Thread H.J. Lu
On Wed, Nov 19, 2014 at 8:38 AM, H.J. Lu  wrote:
>
> One way to optimize it is to make PLT entry to use the normal GOT
> slot:
>
> jmp  *name@GOTPCREL(%rip)
> 8 byte nop
>
> where name@GOTPCREL points to the normal GOT slot
> updated by R_X86_64_GLOB_DAT relocation at run-time.
> Should I give it a try?

I turned out that we can reuse BND PLT.  I implemented it in BFD ld
on hjl/plt.got branch:

https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/hjl/plt.got

I tested it on glibc and it works.  It should work with all models.  Please
give it a try.


-- 
H.J.


Re: What is R_X86_64_GOTPLT64 used for?

2014-11-19 Thread H.J. Lu
On Wed, Nov 19, 2014 at 3:54 PM, H.J. Lu  wrote:
> On Wed, Nov 19, 2014 at 8:38 AM, H.J. Lu  wrote:
>>
>> One way to optimize it is to make PLT entry to use the normal GOT
>> slot:
>>
>> jmp  *name@GOTPCREL(%rip)
>> 8 byte nop
>>
>> where name@GOTPCREL points to the normal GOT slot
>> updated by R_X86_64_GLOB_DAT relocation at run-time.
>> Should I give it a try?
>
> I turned out that we can reuse BND PLT.  I implemented it in BFD ld
> on hjl/plt.got branch:
>
> https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/hjl/plt.got
>
> I tested it on glibc and it works.  It should work with all models.  Please
> give it a try.

I spoke too soon.  I found a problem and I will investigate it.

-- 
H.J.