Re: Help with implementing Wine optimization experiment
On 08/14/2016 08:23 AM, Daniel Santos wrote:

> ms_abi_push_regs:
>     pop    %rax
>     push   %rdi
>     push   %rsi
>     sub    $0xa8,%rsp
>     movaps %xmm6,(%rsp)
>     movaps %xmm7,0x10(%rsp)
>     movaps %xmm8,0x20(%rsp)
>     movaps %xmm9,0x30(%rsp)
>     movaps %xmm10,0x40(%rsp)
>     movaps %xmm11,0x50(%rsp)
>     movaps %xmm12,0x60(%rsp)
>     movaps %xmm13,0x70(%rsp)
>     movaps %xmm14,0x80(%rsp)
>     movaps %xmm15,0x90(%rsp)
>     jmp    *(%rax)

I think this will be quite slow because it breaks the return stack optimization in the CPU. I think you should push the return address and use RET.

Florian
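As a side note on the quoted routine, the 0xa8 passed to sub can be checked with a little arithmetic. The sketch below is only an illustration, not code from the thread. It assumes the ms_abi function is entered with %rsp 8 bytes off a 16-byte boundary (the usual state right after a call) and that its very first instruction is the call to ms_abi_push_regs. The helper names are made up for this example. Under those assumptions, ten 16-byte XMM slots account for 0xa0 bytes, and the remaining 8 bytes leave %rsp 16-byte aligned, which movaps requires.

```c
#include <assert.h>
#include <stdio.h>

enum { XMM_COUNT = 10, XMM_BYTES = 16 };   /* xmm6..xmm15, 16 bytes each */

/* Hypothetical helper: new value of (rsp mod 16) after rsp changes by delta. */
static unsigned step(unsigned mod16, int delta)
{
    int m = ((int)mod16 + delta) % 16;
    return (unsigned)((m + 16) % 16);
}

/* Bytes needed for the XMM registers alone. */
unsigned xmm_save_bytes(void)
{
    return XMM_COUNT * XMM_BYTES;
}

/* Track rsp mod 16 through the quoted instruction sequence. */
unsigned rsp_mod16_at_movaps(void)
{
    unsigned m = 8;          /* assumption: state on entry to the ms_abi function */
    m = step(m, -8);         /* call ms_abi_push_regs pushes a return address */
    m = step(m, +8);         /* pop  %rax */
    m = step(m, -8);         /* push %rdi */
    m = step(m, -8);         /* push %rsi */
    m = step(m, -0xa8);      /* sub  $0xa8,%rsp */
    return m;
}

void print_save_area_summary(void)
{
    assert(xmm_save_bytes() <= 0xa8);
    printf("xmm bytes: %#x, padding: %u, rsp mod 16 at movaps: %u\n",
           xmm_save_bytes(), 0xa8 - xmm_save_bytes(), rsp_mod16_at_movaps());
}
```

Calling print_save_area_summary() from a test program prints the three values. If the entry assumption does not hold for a given function, the padding would need to differ.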
Re: Help with implementing Wine optimization experiment
On Mon, Aug 15, 2016 at 2:16 AM, Jeff Law wrote:
> On 08/14/2016 01:57 AM, Trevor Saunders wrote:
>>
>> On Sun, Aug 14, 2016 at 01:23:16AM -0500, Daniel Santos wrote:
>>>
>>> I'm experimenting with ways to optimize wine (x86 target only) and I
>>> believe I can shrink wine's total text size by around 7% by outlining
>>> the lengthy pro- and epilogues required for ms_abi functions making
>>> sysv_abi calls. Theoretically, fewer instruction cache misses will
>>> offset the extra 4 instructions per function and result in a net
>>> performance gain. However, I'm new to the gcc project and a novice x86
>>> assembly programmer as well (have been wanting to work on gcc for a
>>> while now!) In short, I want to:
>>>
>>> 1. Replace the prologue that pushes di, sp and xmm6-15 with a single
>>>    call to a global "ms_abi_push_regs" routine
>>> 2. Replace the epilogue that pops these regs with a jmp to a global
>>>    "ms_abi_pop_regs" routine
>>> 3. Add the two routines somewhere so that they are linked into the
>>>    output.
>>
>> I think you want to put those into libgcc then.
>
> Right. That's what I've done with out-of-line prologues/epilogues in the
> past.

In the static part, of course. Not sure if we always have/link that on x86_64/i?86.

Richard.

> Jeff
Re: Help with implementing Wine optimization experiment
On 14.08.2016 at 08:23, Daniel Santos wrote:
> I'm experimenting with ways to optimize wine (x86 target only) and I
> believe I can shrink wine's total text size by around 7% by outlining the
> lengthy pro- and epilogues required for ms_abi functions making sysv_abi
> calls. Theoretically, fewer instruction cache misses will offset the extra
> 4 instructions per function and result in a net performance gain. However,
> I'm new to the gcc project and a novice x86 assembly programmer as well
> (have been wanting to work on gcc for a while now!) In short, I want to:
>
> 1. Replace the prologue that pushes di, sp and xmm6-15 with a single call
>    to a global "ms_abi_push_regs" routine
> 2. Replace the epilogue that pops these regs with a jmp to a global
>    "ms_abi_pop_regs" routine
> 3. Add the two routines somewhere so that they are linked into the output.
>
> I have this working in a small-scale experiment (writing the ms_abi
> function in assembly), but I'm not certain how I would add these routines.
> Should I make them built-ins?
>
> I have found the code that adds the clobber RTL instructions in
> ix86_expand_call() (gcc/config/i386/i386.c:25832), and I see that
> thread_prologue_and_epilogue_insns() (gcc/function.c) is where these
> clobbers are expanded into the prologue and epilogue, but I'm not sure
> what the cleanest way to convert this is. My thought was to replace the
> clobber_reg() calls with one that would add an insn_call, or would it be
> better to do this in thread_prologue_and_epilogue_insns() where prologue
> and epilogue generation belongs? But that function is for all targets. Any
> pointers greatly appreciated!
>

Hi,

Thanks for working on this, but I haven't seen any discussion of it on wine-devel recently. I'm also not an expert in this area, but doesn't this risk breaking copy protection and hotpatching? I just wanted to remind you of those two things so that the implementation will be useful.
GCC 6.2 Release Candidate available from gcc.gnu.org
The first release candidate for the second release from the GCC 6 branch, GCC 6.2.0, from SVN revision r239476, is available from ftp://gcc.gnu.org/pub/gcc/snapshots/6.2.0-RC-20160815/ and shortly its mirrors. I have so far bootstrapped and tested the release candidate on x86_64-unknown-linux-gnu. Please test it and report any issues to bugzilla. If all goes well, we'd like to release GCC 6.2.0 early next week.
GCC 6.2 Status Report (2016-08-15)
Status
======

The GCC 6 branch is now frozen for blocking regressions and documentation fixes only; all changes to the branch now require RM approval.

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2              127   -   8
P3               15   +   5
P4              116   -   5
P5               31   +   1
--------        ---   -----------------------
Total P1-P3     142   -   3
Total           289   -   7

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2016-08/msg00031.html
Re: [gimplefe] "Unknown tree: c_maybe_const_expr" error while parsing conditional expression
On 11 August 2016 at 15:58, Richard Biener wrote:
> On Thu, Aug 11, 2016 at 7:47 AM, Prasad Ghangal wrote:
>> In this patch I am trying to parse gimple call. But I am getting weird
>> gimple dump for that.
>>
>> for this testcase:
>> int __GIMPLE() bar()
>> {
>>   int a;
>>   a = a + 1;
>>   return a;
>> }
>>
>> void __GIMPLE() foo()
>> {
>>   int b;
>>   b = bar();
>> }
>>
>> I am getting ssa dump as:
>>
>> /* Function bar (bar, funcdef_no=0, decl_uid=1744, cgraph_uid=0,
>> symbol_order=0)*/
>>
>> int
>> bar ()
>> {
>>   struct FRAME.bar FRAME.0;
>>   int a;
>>   void * D_1754;
>>   void * _3;
>>
>>   bb_2:
>>   _3 = __builtin_dwarf_cfa (0);
>>   FRAME.0.FRAME_BASE.PARENT = _3;
>>   a_6 = a_5(D) + 1;
>>   return a_6;
>>
>> }
>>
>>
>>
>> /* Function foo (foo, funcdef_no=1, decl_uid=1747, cgraph_uid=1,
>> symbol_order=1)*/
>>
>> void
>> foo ()
>> {
>>   int b;
>>
>>   bb_2:
>>   b_3 = bar ();
>>   return;
>>
>> }
>>
>
> Somehow foo is treated as nested in bar. Note this even happens
> without calls if you have two functions in the testcase. Usually this
> means after finishing parsing of a function you fail to reset. Looks
> like the following fixes it:
>
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 95615bc..b35eada 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -2164,6 +2165,8 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
>        c_parser_parse_gimple_body (parser);
>        in_late_binary_op = saved;
>        cgraph_node::finalize_function (current_function_decl, false);
> +      set_cfun (NULL);
> +      current_function_decl = NULL;
>        timevar_pop (tv);
>        return;
>      }
>
> Richard.
>
I have updated the patch and committed along with testcases

Thanks,
Prasad

>>
>> On 9 August 2016 at 14:37, Richard Biener wrote:
>>> On Sun, Aug 7, 2016 at 3:19 PM, Prasad Ghangal wrote:
>>>> On 4 August 2016 at 18:29, Richard Biener wrote:
>>>>> On Thu, Aug 4, 2016 at 1:31 PM, Prasad Ghangal wrote:
>>>>>> On 2 August 2016 at 14:29, Richard Biener wrote:
>>>>>>> On Mon, Aug 1, 2016 at 4:52 PM, Prasad Ghangal wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to replace c_parser_paren_condition (parser) in
>>>>>>>> c_parser_gimple_if_stmt by c_parser_gimple_paren_condition (parser)
>>>>>>>> as described in the patch
>>>>>>>>
>>>>>>>> I am trying test case
>>>>>>>> void __GIMPLE () foo ()
>>>>>>>> {
>>>>>>>>   int a;
>>>>>>>> bb_2:
>>>>>>>>   if (a == 2)
>>>>>>>>     goto bb_3;
>>>>>>>>   else
>>>>>>>>     goto bb_4;
>>>>>>>> bb_3:
>>>>>>>>   a_2 = 4;
>>>>>>>> bb_4:
>>>>>>>>   return;
>>>>>>>> }
>>>>>>>>
>>>>>>>> but it fails to parse gimple expression and produces error as
>>>>>>>> /home/prasad/test3.c: In function ‘foo’:
>>>>>>>> /home/prasad/test3.c:1:18: error: invalid operands in gimple comparison
>>>>>>>> void __GIMPLE () foo ()
>>>>>>>>                  ^~~
>>>>>>>> if (<<< Unknown tree: c_maybe_const_expr
>>>>>>>>   a >>> == 2) goto bb_3; else goto bb_4;
>>>>>>>> /home/prasad/test3.c:1:18: internal compiler error: verify_gimple failed
>>>>>>>>
>>>>>>>> I failed to debug where it is setting to C_MAYBE_CONST_EXPR
>>>>>>>
>>>>>>> It's in parsing the binary expression. Btw, you don't need
>>>>>>> lvalue_to_rvalue conversion or truthvalue conversion - source that
>>>>>>> would require this would not be valid GIMPLE. Let me try to debug:
>>>>>>>
>>>>>>>
>>>>>>> (gdb) p debug_tree (cond.value)
>>>>>>> >> type >> size
>>>>>>> unit size
>>>>>>> align 32 symtab 0 alias set -1 canonical type 0x7688b7e0
>>>>>>> precision 32 min max
>>>>>>>
>>>>>>> pointer_to_this >
>>>>>>>
>>>>>>> arg 0 >> 0x7688b7e0 int>
>>>>>>>
>>>>>>> arg 1 >> 0x7688b7e0 int>
>>>>>>> used SI file t.c line 3 col 7 size >> 0x76887ee8 32> unit size
>>>>>>> align 32 context >>
>>>>>>> arg 1 >> 0x7688b7e0 int> constant 2>
>>>>>>> t.c:5:9 start: t.c:5:7 finish: t.c:5:12>
>>>>>>> $5 = void
>>>>>>> (gdb) b ggc-page.c:1444 if result == 0x76997938
>>>>>>> Breakpoint 6 at 0x8a0d3e: file
>>>>>>> /space/rguenther/src/gcc_gimple_fe/gcc/ggc-page.c, line 1444.
>>>>>>> (gdb) run
>>>>>>>
>>>>>>> Breakpoint 6, ggc_internal_alloc (size=40, f=0x0, s=0, n=1)
>>>>>>> at /space/rguenther/src/gcc_gimple_fe/gcc/ggc-page.c:1444
>>>>>>> 1444 return result;
>>>>>>> (gdb) fin (a few times)
>>>>>>> Run till exit from #0 0x011821b7 in build2_stat (
>>>>>>> code=C_MAYBE_CONST_EXPR, tt=,
>>>>>>> arg0=, arg1=)
>>>>>>> at /space/rguenther/src/gcc_gimple_fe/gcc/tree.c:4466
>>>>>>> 0x0081d263 in c_wrap_maybe_const (expr=>> a>,
>>>>>>> non_const=false)
>>>>>>> at /space/rguenther/src/gcc_gimple_fe/gcc/c-family/c-common.c:4359
>>>>>>> 4359 expr = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (expr), NULL,
>>>>>>> expr);
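
The diagnosis in the message above (state left over from one function makes the next function look nested) can be shown with a toy model. This is a sketch for illustration only and is not GCC code: struct fn, current_fn, parse_fn and reset_parser are hypothetical names, and the single global pointer merely stands in for parser state such as cfun and current_function_decl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy record for a parsed function; parent != NULL means "nested in parent". */
struct fn {
    const char *name;
    const struct fn *parent;
};

/* Stand-in for parser-global "function currently being compiled" state. */
static const struct fn *current_fn = NULL;

void reset_parser(void)
{
    current_fn = NULL;
}

/* Parse one function. With reset_after == 0 the global state leaks into
   the next call, which is the bug pattern described in the thread. */
void parse_fn(struct fn *f, const char *name, int reset_after)
{
    assert(f != NULL && name != NULL);
    f->name = name;
    f->parent = current_fn;   /* a stale value makes f look nested */
    current_fn = f;
    /* ... the function body would be parsed here ... */
    if (reset_after)
        current_fn = NULL;    /* analogous in spirit to the two added lines */
}

void describe(const struct fn *f)
{
    printf("%s: %s%s\n", f->name,
           f->parent ? "nested in " : "top level",
           f->parent ? f->parent->name : "");
}
```

Parsing "bar" and then "foo" with reset_after set to 0 leaves foo's parent pointing at bar, while passing 1 keeps both at top level.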
Re: Help with implementing Wine optimization experiment
On 08/15/2016 06:35 AM, André Hentschel wrote:
> Hi,
>
> Thanks for working on this, but I haven't seen any discussion of it on
> wine-devel recently. I'm also not an expert in this area, but doesn't this
> risk breaking copy protection and hotpatching? I just wanted to remind you
> of those two things so that the implementation will be useful.

Thanks for your response! I've run into the hot-patching code a lot while working on this. Not breaking these should be easy since these functions are explicitly marked with the ms_hook_prologue attribute, so I can just skip altering these functions for now. This attribute is assigned in Wine via expansion of the macro DECLSPEC_HOTPATCH and I'm currently only counting 171 such functions.

I'm not sure about breaking copy protections, however, and I don't really know what the issues are pertaining to this. I'm mostly doing this as an experiment for now, and admittedly as an excuse to finally start hacking away at gcc, which I've been wanting to do for several years now. Until I know more about what the various copy protection mechanisms look for, I'm going to ignore it and address it later. Presuming that this experiment turns out to be useful, it might be implemented as a function attribute so that functions that need to appear a certain way to copy protection software can omit the optimization, similar to ms_hook_prologue.

Daniel
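
For readers unfamiliar with the attributes mentioned above, here is a minimal sketch of how they appear in source. It is not taken from Wine or from the proposed patch. The macro names MS_ABI, SYSV_ABI, NOINLINE and HOTPATCH and the function names are made up for this example, and the macros fall back to nothing on compilers or targets that lack the attributes. On x86_64 GCC, thunk_plain is the kind of function whose prologue and epilogue the experiment would outline, while thunk_hotpatch carries ms_hook_prologue and would be skipped.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper macros; empty where the attributes are unavailable. */
#if defined(__GNUC__) && defined(__x86_64__)
# define MS_ABI   __attribute__((ms_abi))
# define SYSV_ABI __attribute__((sysv_abi))
#else
# define MS_ABI
# define SYSV_ABI
#endif

#if defined(__GNUC__)
# define NOINLINE __attribute__((noinline))
#else
# define NOINLINE
#endif

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(ms_hook_prologue)
# define HOTPATCH __attribute__((ms_hook_prologue))
#else
# define HOTPATCH
#endif

/* A sysv_abi callee, kept out of line so the cross-ABI call really happens. */
static NOINLINE SYSV_ABI long host_add(long a, long b)
{
    return a + b;
}

/* ms_abi entry point calling sysv_abi code: candidate for outlining. */
MS_ABI long thunk_plain(long a, long b)
{
    return host_add(a, b);
}

/* Same body, but marked hot-patchable: would be left untouched. */
HOTPATCH MS_ABI long thunk_hotpatch(long a, long b)
{
    return host_add(a, b);
}

void print_thunk_results(void)
{
    assert(thunk_plain(1, 1) == thunk_hotpatch(1, 1));
    printf("%ld %ld\n", thunk_plain(2, 3), thunk_hotpatch(40, 2));
}
```

Comparing the assembly of the two thunks (for example with gcc -O2 -S on x86_64) shows the register saves that the ms_abi to sysv_abi transition requires.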
Re: Help with implementing Wine optimization experiment
On 08/15/2016 05:56 AM, Richard Biener wrote:
> On Mon, Aug 15, 2016 at 2:16 AM, Jeff Law wrote:
>> On 08/14/2016 01:57 AM, Trevor Saunders wrote:
>>> On Sun, Aug 14, 2016 at 01:23:16AM -0500, Daniel Santos wrote:
>>>> I'm experimenting with ways to optimize wine (x86 target only) and I
>>>> believe I can shrink wine's total text size by around 7% by outlining
>>>> the lengthy pro- and epilogues required for ms_abi functions making
>>>> sysv_abi calls. Theoretically, fewer instruction cache misses will
>>>> offset the extra 4 instructions per function and result in a net
>>>> performance gain. However, I'm new to the gcc project and a novice
>>>> x86 assembly programmer as well (have been wanting to work on gcc for
>>>> a while now!) In short, I want to:
>>>>
>>>> 1. Replace the prologue that pushes di, sp and xmm6-15 with a single
>>>>    call to a global "ms_abi_push_regs" routine
>>>> 2. Replace the epilogue that pops these regs with a jmp to a global
>>>>    "ms_abi_pop_regs" routine
>>>> 3. Add the two routines somewhere so that they are linked into the
>>>>    output.
>>>
>>> I think you want to put those into libgcc then.
>>
>> Right. That's what I've done with out-of-line prologues/epilogues in
>> the past.
>
> In the static part, of course. Not sure if we always have/link that on
> x86_64/i?86.
>
> Richard.

Thanks all!

Well, Wine's libs certainly do not appear to be dynamically linked with libgcc, and I didn't know that it had such a static portion, so thank you for this! Getting this in will solve half of the problem.

Also, I should have mentioned that I see this as a stop-gap to actually performing static analysis and completely disabling floating point in Wine's libs wherever possible, but that's a much larger project. So I'm hoping to be able to show some improvements from this mechanism.

Daniel