On Tue, Jul 20, 2021 at 10:54 AM JiangNing OS via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-
> > bounces+jiangning=os.amperecomputing....@gcc.gnu.org> On Behalf Of
> > Martin Jambor
> > Sent: Wednesday, June 30, 2021 4:19 AM
> > To: GCC Patches <gcc-patches@gcc.gnu.org>
> > Cc: Jan Hubicka <hubi...@ucw.cz>
> > Subject: [RFC] ipa: Adjust references to identify read-only globals
> >
> > Hi,
> >
> > this patch has been motivated by SPEC 2017's 544.nab_r in which there is a
> > static variable which is never written to and so zero throughout the
> > run-time of the benchmark.  However, it is passed by reference to a
> > function in which it is read and (after some multiplications) passed into
> > __builtin_exp which in turn unnecessarily consumes almost 10% of the total
> > benchmark run-time.
>
> I do see ~8.5% runtime reduction on aarch64.
>
> > The situation is illustrated by the added testcase remref-3.c.
> >
> > The patch adds a flag to ipa-prop descriptor of each parameter to mark such
> > parameters.  IPA-CP and inlining then take the effort to remove
> > IPA_REF_ADDR references in the caller and only add IPA_REF_LOAD reference
> > to the clone/overall inlined function.  This is sufficient for subsequent
> > symbol table analysis code to identify the read-only variable as such and
> > optimize the code.
> >
> > I plan to compile a number of packages with the patch to test it some more
> > and get a bit better idea of its impact.  But it has passed bootstrap,
> > LTO bootstrap and testing on x86_64-linux and i686-linux and so unless I
> > find any problem, I would like to commit it at some point next month
> > without any major changes, so I'd be grateful for any feedback even now.
>
> I see 3 cases in SPEC2017 failed to compile on aarch64, i.e. 521.wrf_r,
> 527.cam4_r, 554.roms_r.
> For example,
>
> pre_step3d.fppized.f90:1260:35: internal compiler error: Segmentation fault
>  1260 |       CALL wclock_on (ng, iNLM, 22)
>       |                                   ^
> 0x1645c6b internal_error(char const*, ...)
>         ???:0
> 0xe1f4f4 place_block_symbol(rtx_def*)
>         ???:0
> 0x84ab33 use_anchored_address(rtx_def*)
>         ???:0
> 0x868203 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
>         ???:0
> 0x868793 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
>         ???:0
> 0x75b593 expand_call(tree_node*, rtx_def*, int)
>         ???:0
> 0x86a09f expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
>         ???:0
> Please submit a full bug report

Please file a bug report and provide a (possibly reduced) testcase.

Thanks,
Richard.

> Thanks,
> -Jiangning
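
For readers following the thread, the pattern described in the quoted RFC can be sketched in C roughly as below. This is only an illustration under stated assumptions, not the remref-3.c testcase from the patch: the names `damping`, `attenuate`, `energy` and `slow_exp` are hypothetical, and `slow_exp` is a self-contained stand-in for the `__builtin_exp` call so that the sketch needs no libm. Whether a given GCC version actually folds the call away depends on the patch and the optimization flags in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global: never written in this translation unit,
   so it keeps its zero initializer for the whole run. */
static double damping;

/* Stand-in for the expensive __builtin_exp call (Taylor series). */
static double slow_exp(double x)
{
    double term = 1.0, sum = 1.0;
    for (int n = 1; n <= 30; ++n) {
        term *= x / n;   /* term is now x^n / n! */
        sum += term;
    }
    return sum;
}

/* Hypothetical callee: only reads the global, and only through a pointer. */
static double attenuate(const double *coef, double dist)
{
    assert(coef != NULL);
    /* If *coef were known to be 0.0, this would reduce to exp(-0.0) == 1.0. */
    return slow_exp(-(*coef) * dist * dist);
}

double energy(double dist)
{
    /* Taking the address here is what, per the RFC, leaves an address
       reference to the variable in the caller, which on its own does not
       show that the variable is only ever loaded. */
    return attenuate(&damping, dist);
}
```

Because `damping` is only ever read, every call to `energy` returns 1.0 regardless of `dist`; the RFC's goal is to let the symbol table analysis see that and avoid the costly call.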