Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign
On Tue, Nov 18, 2014 at 10:04 PM, David Malcolm wrote: > On Tue, 2014-11-18 at 10:53 +0100, Richard Biener wrote: >> On Tue, Nov 18, 2014 at 2:59 AM, David Malcolm wrote: >> > On Mon, 2014-11-17 at 11:06 +0100, Richard Biener wrote: >> >> On Sat, Nov 15, 2014 at 12:00 PM, David Malcolm >> >> wrote: >> >> > On Thu, 2014-11-13 at 11:45 +0100, Richard Biener wrote: >> >> >> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm >> >> >> wrote: >> >> >> > On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote: >> >> >> >> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek >> >> >> >> wrote: >> >> >> >> > On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote: >> >> >> >> >> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote: >> >> >> >> >> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote: >> >> >> >> >> > > To be constructive here - the above case is from within a >> >> >> >> >> > > GIMPLE_ASSIGN case label >> >> >> >> >> > > and thus I'd have expected >> >> >> >> >> > > >> >> >> >> >> > > case GIMPLE_ASSIGN: >> >> >> >> >> > > { >> >> >> >> >> > > gassign *a1 = as_a (s1); >> >> >> >> >> > > gassign *a2 = as_a (s2); >> >> >> >> >> > > lhs1 = gimple_assign_lhs (a1); >> >> >> >> >> > > lhs2 = gimple_assign_lhs (a2); >> >> >> >> >> > > if (TREE_CODE (lhs1) != SSA_NAME >> >> >> >> >> > > && TREE_CODE (lhs2) != SSA_NAME) >> >> >> >> >> > > return (operand_equal_p (lhs1, lhs2, 0) >> >> >> >> >> > > && gimple_operand_equal_value_p >> >> >> >> >> > > (gimple_assign_rhs1 (a1), >> >> >> >> >> > > >> >> >> >> >> > > gimple_assign_rhs1 (a2))); >> >> >> >> >> > > else if (TREE_CODE (lhs1) == SSA_NAME >> >> >> >> >> > >&& TREE_CODE (lhs2) == SSA_NAME) >> >> >> >> >> > > return vn_valueize (lhs1) == vn_valueize (lhs2); >> >> >> >> >> > > return false; >> >> >> >> >> > > } >> >> >> >> >> > > >> >> >> >> >> > > instead. That's the kind of changes I have expected and >> >> >> >> >> > > have approved of. >> >> >> >> >> > >> >> >> >> >> > But even that looks like just adding extra work for all >> >> >> >> >> > developers, with no >> >> >> >> >> > gain. You only have to add extra code and extra temporaries, >> >> >> >> >> > in switches >> >> >> >> >> > typically also have to add {} because of the temporaries and >> >> >> >> >> > thus extra >> >> >> >> >> > indentation level, and it doesn't simplify anything in the >> >> >> >> >> > code. >> >> >> >> >> >> >> >> >> >> The branch attempts to use the C++ typesystem to capture >> >> >> >> >> information >> >> >> >> >> about the kinds of gimple statement we expect, both: >> >> >> >> >> (A) so that the compiler can detect type errors, and >> >> >> >> >> (B) as a comprehension aid to the human reader of the code >> >> >> >> >> >> >> >> >> >> The ideal here is when function params and struct field can be >> >> >> >> >> strengthened from "gimple" to a subclass ptr. This captures the >> >> >> >> >> knowledge that every use of a function or within a struct has a >> >> >> >> >> given >> >> >> >> >> gimple code. >> >> >> >> > >> >> >> >> > I just don't like all the as_a/is_a stuff enforced everywhere, >> >> >> >> > it means more typing, more temporaries, more indentation. >> >> >> >> > So, as I view it, instead of the checks being done cheaply (yes, >> >> >> >> > I think >> >> >> >> > the gimple checking as we have right now is very cheap) under the >> >> >> >> > hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those >> >> >> >> > changes >> >> >> >> > put the burden on the developers, who has to check that manually >> >> >> >> > through >> >> >> >> > the as_a/is_a stuff everywhere, more typing and uglier syntax. >> >> >> >> > I just don't see that as a step forward, instead a huge step >> >> >> >> > backwards. >> >> >> >> > But perhaps I'm alone with this. >> >> >> >> > Can you e.g. compare the size of - lines in your patchset >> >> >> >> > combined, and >> >> >> >> > size of + lines in your patchset? As in, if your changes lead to >> >> >> >> > less >> >> >> >> > typing or more. >> >> >> >> >> >> >> >> I see two ways out here. One is to add overloads to all the >> >> >> >> functions >> >> >> >> taking the special types like >> >> >> >> >> >> >> >> tree >> >> >> >> gimple_assign_rhs1 (gimple *); >> >> >> >> >> >> >> >> or simply add >> >> >> >> >> >> >> >> gassign *operator ()(gimple *g) { return as_a (g); } >> >> >> >> >> >> >> >> into a gimple-compat.h header which you include in places that >> >> >> >> are not converted "nicely". >> >> >> > >> >> >> > Thanks for the suggestions. >> >> >> > >> >> >> > Am I missing something, or is the gimple-compat.h idea above not >> >> >> > valid C >> >> >> > ++? >> >> >> > >> >> >> > Note that "gimple" is still a typedef to >> >> >> > gimple_statement_base * >> >> >> > (as noted before, the gimple -> gimple * change would break e
Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.
On Wed, Nov 19, 2014 at 8:21 AM, Richard Biener wrote: > On November 19, 2014 8:13:09 AM CET, Wei Mi wrote: >>We see an inline problem as below caused by r201408 >>(https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html). >> >>hoo() { >> foo(); >> ... >>} >> >>foo { >> goo(); >> ... >>} >> >>foo is func splitted, so its body changes to >> >>foo { >> goo(); >> ... >> foo.part(); >>} >> >>and the used_as_abstract_origin of cgraph node of foo will be set to >>true after func splitting. >> >>In ipa-inline, when inlining foo into hoo, the original node of foo >>will not be reused as clone node because used_as_abstract_origin of >>cgraph node of foo is true and can_remove_node_now_p_1 will return >>false, so that a new clone node of foo will be created. This is the >>case in gcc-4_9. >>In gcc-4_8, the original node of foo will be reused as clone node. >> >>gcc-4_8 >>foo >> | >>goo >> >>gcc-4_9 >>foofoo_clone >>\/ >> goo >> >>Because of the difference of whether to create a new clone for foo, >>when inlining goo to foo, the overall growth of inlining all callsites >>of goo in gcc-4_8 will be less than gcc-4_9 (goo has two callsites in >>gcc-4_9 but only one in gcc-4_8). If we have many cases like this, >>gcc-4_8 will actually have more inline growth budget than gcc-4_9 and >>will inline more aggressively than gcc-4_9. >> >>I don't understand the exact usage of the check about >>node->used_as_abstract_origin in can_remove_node_now_p_1, but I feel >>puzzled about following two points: >> >>1. https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html said the >>patch was to ensure all abstract origin functions do have nodes >>attached. However, even if the node of origin function is reused as a >>clone node, a new clone node will be created in following code in >>symbol_table::remove_unreachable_nodes if only the node that needs >>abstract origin is reachable. >> >> if (TREE_CODE (node->decl) == FUNCTION_DECL >> && DECL_ABSTRACT_ORIGIN (node->decl)) >>{ >> struct cgraph_node *origin_node >> = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl)); >> origin_node->used_as_abstract_origin = true; >> enqueue_node (origin_node, &first, &reachable); >>} >> >>2. DECL_ABSTRACT_ORIGIN(decl) seems only useful for debug info of >>clone nodes. But now the check of used_as_abstract_origin affect >>inline decisions, which should be the same with or without keeping >>debug info. > > I think we need to keep the functions but do not need to account for them in > the unit size if we otherwise could remove them Btw - please make sure to open a bug so this issue doesn't get lost. Thanks, Richard. > Richard. > >>Thanks, >>Wei. > >
Re: Maxim Kuvyrkov appointed Android sub-port reviewer
On Nov 14, 2014, at 9:00 PM, H.J. Lu wrote: > On Sun, Apr 15, 2012 at 5:08 PM, David Edelsohn wrote: >>I am pleased to announce that the GCC Steering Committee has >> appointed Maxim Kuvyrkov as reviewer for the Android sub-port. >> >>Please join me in congratulating Maxim on his new role. >> Maxim, please update your listing in the MAINTAINERS file. >> >> Happy hacking! >> David >> > > Hi Maxim, > > Have you added your name to MAINTAINERS? Will do. Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: What is R_X86_64_GOTPLT64 used for?
On Mon, Nov 17, 2014 at 6:14 AM, Michael Matz wrote: > Hi, > > On Thu, 13 Nov 2014, H.J. Lu wrote: > >> Linker does: >> >> ... code that looks like it might create just one GOT slot ... >> >> So if a symbol is accessed by both @GOT and @PLTOFF, its >> needs_plt will be true and its got.plt entry will be used for >> both @GOT and @GOTPLT. @GOTPLT has no advantage >> over @GOT, but potentially wastes a PLT entry. > > The above is not correct. Had you tried you'd see this: > > % cat x.c > extern void foo (void); > void main (void) > { > void (*f)(void) = foo; > f(); > foo(); > } > % gcc -fPIE -mcmodel=large -S x.c; cat x.s > ... > movabsq $foo@GOT, %rax > ... > movabsq $foo@PLTOFF, %rax > ... > > So, foo is access via @GOT offset and @PLTOFF. Then, > > % cat y.c > void foo (void) {} > % gcc -o liby.so -shared -fPIC y.c > % gcc -fPIE -mcmodel=large x.s liby.so > % readelf -r a.out > ... > 00600ff8 00040006 R_X86_64_GLOB_DAT foo + 0 > ... > 00601028 00040007 R_X86_64_JUMP_SLO foo + 0 > ... > > The first one (to 600ff8) is the normal GOT slot, the second one the GOT > slot for the PLT entry. Both are actually used: > > 004005f0 : > 4005f0: ff 25 32 0a 20 00 jmpq *0x200a32(%rip)# > 601028 <_GLOBAL_OFFSET_TABLE_+0x28> They are not: Starting program: /export/home/hjl/bugs/binutils/gotplt/foo Breakpoint 1, main () at main.c:5 5 void (*f)(void) = foo; (gdb) disass Dump of assembler code for function main: 0x0040058d <+0>: push %rbp 0x0040058e <+1>: mov%rsp,%rbp 0x00400591 <+4>: push %rbx 0x00400592 <+5>: sub$0x18,%rsp 0x00400596 <+9>: lea-0x7(%rip),%rbx# 0x400596 0x0040059d <+16>: movabs $0x20042a,%r11 0x004005a7 <+26>: add%r11,%rbx => 0x004005aa <+29>: movabs $0xfff8,%rax 0x004005b4 <+39>: mov(%rbx,%rax,1),%rax 0x004005b8 <+43>: mov%rax,-0x18(%rbp) 0x004005bc <+47>: mov-0x18(%rbp),%rax 0x004005c0 <+51>: callq *%rax 0x004005c2 <+53>: movabs $0xffdffad0,%rax 0x004005cc <+63>: add%rbx,%rax 0x004005cf <+66>: callq *%rax 0x004005d1 <+68>: mov$0x0,%eax 0x004005d6 <+73>: add$0x18,%rsp 0x004005da <+77>: pop%rbx 0x004005db <+78>: pop%rbp 0x004005dc <+79>: retq End of assembler dump. (gdb) b *0x004005c0 Breakpoint 2 at 0x4005c0: file main.c, line 6. (gdb) b *0x004005cf Breakpoint 3 at 0x4005cf: file main.c, line 7. (gdb) c Continuing. Breakpoint 2, 0x004005c0 in main () at main.c:6 6 f(); (gdb) p $rax $5 = 140737352012384 (gdb) disass $rax Dump of assembler code for function foo: 0x77df9260 <+0>: push %rbp 0x77df9261 <+1>: mov%rsp,%rbp 0x77df9264 <+4>: lea0x7(%rip),%rdi# 0x77df9272 0x77df926b <+11>: callq 0x77df9250 0x77df9270 <+16>: pop%rbp 0x77df9271 <+17>: retq End of assembler dump. (gdb) c Continuing. foo Breakpoint 3, 0x004005cf in main () at main.c:7 7 foo(); (gdb) p $rax $6 = 4195472 (gdb) disass $rax Dump of assembler code for function foo@plt: 0x00400490 <+0>: jmpq *0x200552(%rip)# 0x6009e8 0x00400496 <+6>: pushq $0x2 0x0040049b <+11>: jmpq 0x400460 End of assembler dump. (gdb) > That uses the second GOT slot, and: > > 004006ec : > 4006ec: 55 push %rbp > 4006ed: 48 89 e5mov%rsp,%rbp > 4006f0: 53 push %rbx > 4006f1: 48 83 ec 18 sub$0x18,%rsp > 4006f5: 48 8d 1d f9 ff ff fflea-0x7(%rip),%rbx# > 4006f5 > 4006fc: 49 bb 0b 09 20 00 00movabs $0x20090b,%r11 > 400703: 00 00 00 > 400706: 4c 01 dbadd%r11,%rbx > 400709: 48 b8 f8 ff ff ff ffmovabs $0xfff8,%rax > 400710: ff ff ff > 400713: 48 8b 04 03 mov(%rbx,%rax,1),%rax > > This uses the first slot at 0x600ff8. > > So, no, currently GOT and GOTPLT (at least how it's supposed to be > implemented) are not equivalent. GOT reference: void (*f)(void) = foo; f(); gives you the address of function, foo, in liby.so, without going through PLT, while foo() is called via PLT. For function call, we must use PLT. For pointer reference, we don't use PLT slot: 1. We don't need the indirect branch in PLT. 2. All pointer references to the same function should have the same value. One way to optimize it is to make PLT entry to use the normal GOT slot: jmp *name@GOTPCREL(%rip) 8 byte nop where name@GOTPCREL points to the normal GOT slot updated by R_X86_64_GLOB_DAT relocation at run-time. Should I give it a try? -- H.J.
Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.
Submit a bug here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63970 Wei. On Wed, Nov 19, 2014 at 2:18 AM, Richard Biener wrote: > On Wed, Nov 19, 2014 at 8:21 AM, Richard Biener > wrote: >> On November 19, 2014 8:13:09 AM CET, Wei Mi wrote: >>>We see an inline problem as below caused by r201408 >>>(https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html). >>> >>>hoo() { >>> foo(); >>> ... >>>} >>> >>>foo { >>> goo(); >>> ... >>>} >>> >>>foo is func splitted, so its body changes to >>> >>>foo { >>> goo(); >>> ... >>> foo.part(); >>>} >>> >>>and the used_as_abstract_origin of cgraph node of foo will be set to >>>true after func splitting. >>> >>>In ipa-inline, when inlining foo into hoo, the original node of foo >>>will not be reused as clone node because used_as_abstract_origin of >>>cgraph node of foo is true and can_remove_node_now_p_1 will return >>>false, so that a new clone node of foo will be created. This is the >>>case in gcc-4_9. >>>In gcc-4_8, the original node of foo will be reused as clone node. >>> >>>gcc-4_8 >>>foo >>> | >>>goo >>> >>>gcc-4_9 >>>foofoo_clone >>>\/ >>> goo >>> >>>Because of the difference of whether to create a new clone for foo, >>>when inlining goo to foo, the overall growth of inlining all callsites >>>of goo in gcc-4_8 will be less than gcc-4_9 (goo has two callsites in >>>gcc-4_9 but only one in gcc-4_8). If we have many cases like this, >>>gcc-4_8 will actually have more inline growth budget than gcc-4_9 and >>>will inline more aggressively than gcc-4_9. >>> >>>I don't understand the exact usage of the check about >>>node->used_as_abstract_origin in can_remove_node_now_p_1, but I feel >>>puzzled about following two points: >>> >>>1. https://gcc.gnu.org/ml/gcc-patches/2013-08/msg00027.html said the >>>patch was to ensure all abstract origin functions do have nodes >>>attached. However, even if the node of origin function is reused as a >>>clone node, a new clone node will be created in following code in >>>symbol_table::remove_unreachable_nodes if only the node that needs >>>abstract origin is reachable. >>> >>> if (TREE_CODE (node->decl) == FUNCTION_DECL >>> && DECL_ABSTRACT_ORIGIN (node->decl)) >>>{ >>> struct cgraph_node *origin_node >>> = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl)); >>> origin_node->used_as_abstract_origin = true; >>> enqueue_node (origin_node, &first, &reachable); >>>} >>> >>>2. DECL_ABSTRACT_ORIGIN(decl) seems only useful for debug info of >>>clone nodes. But now the check of used_as_abstract_origin affect >>>inline decisions, which should be the same with or without keeping >>>debug info. >> >> I think we need to keep the functions but do not need to account for them in >> the unit size if we otherwise could remove them > > Btw - please make sure to open a bug so this issue doesn't get lost. > > Thanks, > Richard. > >> Richard. >> >>>Thanks, >>>Wei. >> >>
About the GCC mirror on igor.onlinedirect.bg
Hello, I write to inform you that unfortunately OnlineDirect (the sponsoring company) was acquired and the Igor machine will be stopped in the coming weeks. Best regards, Igor team
gcc-4.9-20141119 is now available
Snapshot gcc-4.9-20141119 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20141119/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.9 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 217816 You'll find: gcc-4.9-20141119.tar.bz2 Complete GCC MD5=52192d74d3b8009885130d9beb234aeb SHA1=fc53988f6b268ddaa01a9cdaf5570794fa4af963 Diffs from 4.9-20141112 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.9 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: What is R_X86_64_GOTPLT64 used for?
On Wed, Nov 19, 2014 at 8:38 AM, H.J. Lu wrote: > > One way to optimize it is to make PLT entry to use the normal GOT > slot: > > jmp *name@GOTPCREL(%rip) > 8 byte nop > > where name@GOTPCREL points to the normal GOT slot > updated by R_X86_64_GLOB_DAT relocation at run-time. > Should I give it a try? I turned out that we can reuse BND PLT. I implemented it in BFD ld on hjl/plt.got branch: https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/hjl/plt.got I tested it on glibc and it works. It should work with all models. Please give it a try. -- H.J.
Re: What is R_X86_64_GOTPLT64 used for?
On Wed, Nov 19, 2014 at 3:54 PM, H.J. Lu wrote: > On Wed, Nov 19, 2014 at 8:38 AM, H.J. Lu wrote: >> >> One way to optimize it is to make PLT entry to use the normal GOT >> slot: >> >> jmp *name@GOTPCREL(%rip) >> 8 byte nop >> >> where name@GOTPCREL points to the normal GOT slot >> updated by R_X86_64_GLOB_DAT relocation at run-time. >> Should I give it a try? > > I turned out that we can reuse BND PLT. I implemented it in BFD ld > on hjl/plt.got branch: > > https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/hjl/plt.got > > I tested it on glibc and it works. It should work with all models. Please > give it a try. I spoke too soon. I found a problem and I will investigate it. -- H.J.