Hi Jeff, Thanks for the suggestion. I did a bootstrap x86_64 build before and after my patch and looked for differences in the last stage object files and there were plenty of them. I chose a nice simple function (check_callers) from ipa-inline-analysis.c and reduced it to a small test case. Hope this is ok.
Testing done: * aarch64 built, "make check" no regressions * aarch64_be built, "make check" no regressions * x86_64 built, "make check" no regressions ChangeLog: 2015-05-15 David Sherwood <david.sherw...@arm.com> * loop-invariant.c (create_new_invariant): Don't calculate address cost if mode is not a scalar integer. (get_inv_cost): Increase computational cost for unused invariants. * testsuite/gcc.dg/loop-invariant.c: New testcase. Regards, David. > -----Original Message----- > From: Jeff Law [mailto:l...@redhat.com] > Sent: 11 May 2015 20:27 > To: David Sherwood; gcc-patches@gcc.gnu.org > Subject: Re: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop > invariant motion discovered by > spec2k6 on aarch64 > > On 05/11/2015 06:02 AM, David Sherwood wrote: > > Hi, > > > > This patch fixes a couple of issues I saw during the compilation of > > shell_lam.f > > for 410.bwaves test in spec2006: > > > > * create_new_invariant: We shouldn't bother attempting to calculate the > > address > > cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P > > check > > to eliminate the most obvious cases, such as vector modes and so on. > > * get_inv_cost: Change the code so that we don't treat something as an > > address > > if it is never actually used as an address, i.e. we didn't use it at all. > > This > > occurs due to the way we process the innermost loop first - invariants get > > pulled out of inner most loop and then get processed during next innermost > > loop. > > Here is more detail from shell_lam.f.210r.loop2_invariant dump file that > > shows what currently happens: > > > > 4427 *****ending processing of loop 27 ****** > > > > 4442 (const_vector:V2DF [ > > 4443 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 4444 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 4445 ]) > > 4446 > > 4447 Hot cost: 8 (final) > > 4448 (set (reg:V2DF 3179) > > 4449 (const_vector:V2DF [ > > 4450 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 4451 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 4452 ])) > > 4453 > > 4454 Hot cost: 8 (final) > > 4455 Set in insn 2764 is invariant (1), cost 8, depends on > > > > 4471 Decided to move invariant 1 -- gain 8 > > > > move_invariant_reg moves invariant (the const_vector above) into 3490 and > > replaces use of 3179 with 3490, but conservatively keeps the definition of > > 3179, > > which gets processed in outer loop, but is never used. > > > > Debug for outer loop now follows: > > > > 4497 *****starting processing of loop 26 ****** > > > > 6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490) > > 6380 (const_vector:V2DF [ > > 6381 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 6382 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 6383 ])) shell_lam.f:287 819 {*aarch64_simd_movv2df} > > 6384 (nil)) > > 6385 ;; UD chains for insn luid 88 uid 2766 > > > > 6604 ;; UD chains for insn luid 21 uid 3836 > > 6605 ;; reg 3490 { d419(bb 110 insn 2764) } > > 6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179) > > 6607 (reg:V2DF 3490)) shell_lam.f:287 -1 > > 6608 (nil)) > > ^^^ additional insn generated by move_invariant_reg from loop 27 > > > > 6836 ;; UD chains for insn luid 40 uid 2790 > > 6837 ;; reg 3161 { d272(bb 111 insn 2746) } > > 6838 ;; reg 3201 { d305(bb 111 insn 2788) } > > 6839 ;; reg 3490 { d419(bb 110 insn 2764) } > > 6840 ;; eq_note reg 3161 { d272(bb 111 insn 2746) } > > 6841 ;; eq_note reg 3201 { d305(bb 111 insn 2788) } > > 6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ]) > > 6843 (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, > > index: ivtmp.679_3214, > > offset: 0B] ]) > > 6844 (neg:V2DF (reg:V2DF 3490)) > > 6845 (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 > > 1219 {fnmav2df4} > > 6846 (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ]) > > 6847 (expr_list:REG_DEAD (reg:V2DF 3179) ... > > ^^^ 3179 marked as DEAD > > > > 8366 *****ending processing of loop 26 ****** > > > > 8472 (const_vector:V2DF [ > > 8473 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 8474 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 8475 ]) > > 8476 > > 8477 Hot cost: 8 (final) > > 8478 (set (reg:V2DF 3490) > > 8479 (const_vector:V2DF [ > > 8480 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 8481 (const_double:DF 4.0e+0 [0x0.8p+3]) > > 8482 ])) > > 8483 > > 8484 Hot cost: 8 (final) > > 8485 Set in insn 2764 is invariant (12), cost 8, depends on > > > > 8505 (set (reg:V2DF 3179) > > 8506 (reg:V2DF 3490)) > > 8507 > > 8508 Hot cost: 8 (final) > > 8509 Set in insn 3836 is invariant (15), cost 8, depends on 12 > > ^^^ Never moves 3179 invariant out, even though it's never used. > > > > After my patch we now get this: > > > > 8586 Decided to move invariant 15 -- gain 16 > > 8587 Decided to move dependent invariant 12 > > > > since get_inv_cost sees the number of uses is zero and the dependent > > invariant > > (12) gets moved too. Of course, without my patch the definition of 3179 > > would > > ultimately be eliminated as dead code in a later pass anyway. > > > > Is this ok to go in? > > > > Regards, > > David Sherwood. > > > > ChangeLog entry follows ... > > > > 2015-05-08 David Sherwood <david.sherw...@arm.com> > > > > * loop-invariant.c (create_new_invariant): Don't calculate address > > cost > > if mode is not scalar integers. > > (get_inv_cost): Increase computational cost for unused invariants. > So you'd need to bootstrap and regression test this change for it to be > fully ready for review. > > It would also be good to include a test for the testsuite where you can > show the improvements in behaviour due to this change. I'm not offhand > sure of the license on bwaves, so you may or may not be able to reduce > it and use the reduced code in the GCC testsuite. > > With some instrumentation, you may easily find the second case > triggering code differences in GCC itself, which you could then reduce > and use as a test for the second change. > > Conceptually both hunks seem reasonable, so really it's just a matter of > building some testcases to verify behaviour and the bootstrap/regression > test. > > There's a number of ARM engineers contributing to GCC these days that > can probably help. > > Jeff > > > >
loop-invariant2.patch
Description: Binary data