Hi,

On Tue, Jul 20 2021, Richard Biener wrote:
> On Tue, Jul 20, 2021 at 10:54 AM JiangNing OS via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> > -----Original Message-----
>> > From: Gcc-patches <gcc-patches-
>> > bounces+jiangning=os.amperecomputing....@gcc.gnu.org> On Behalf Of
>> > Martin Jambor
>> > Sent: Wednesday, June 30, 2021 4:19 AM
>> > To: GCC Patches <gcc-patches@gcc.gnu.org>
>> > Cc: Jan Hubicka <hubi...@ucw.cz>
>> > Subject: [RFC] ipa: Adjust references to identify read-only globals
>> >
>> > Hi,
>> >
>> > this patch has been motivated by SPEC 2017's 544.nab_r in which there is a
>> > static variable which is never written to and so zero throughout the run-time
>> > of the benchmark.  However, it is passed by reference to a function in which
>> > it is read and (after some multiplications) passed into __builtin_exp which in
>> > turn unnecessarily consumes almost 10% of the total benchmark run-time.
>>
>> I do see ~8.5% runtime reduction on aarch64.
>>
>> > The situation is illustrated by the added testcase remref-3.c.
>> >
>> > The patch adds a flag to ipa-prop descriptor of each parameter to mark such
>> > parameters.  IPA-CP and inlining then take the effort to remove IPA_REF_ADDR
>> > references in the caller and only add IPA_REF_LOAD reference to the
>> > clone/overall inlined function.  This is sufficient for subsequent symbol table
>> > analysis code to identify the read-only variable as such and optimize the code.
>> >
>> > I plan to compile a number of packages with the patch to test it some more
>> > and get a bit better idea of its impact.  But it has passed bootstrap,
>> > LTObootstrap and testing on x86_64-linux and i686-linux and so unless I find
>> > any problem, I would like to commit it at some point next month without any
>> > major changes, so I'd be grateful for any feedback even now.
>>
>> I see 3 cases in SPEC2017 failed to compile on aarch64, i.e. 521.wrf_r, 
>> 527.cam4_r, 554.roms_r. For example,
>>
>> pre_step3d.fppized.f90:1260:35: internal compiler error: Segmentation fault
>>  1260 |       CALL wclock_on (ng, iNLM, 22)
>>       |                                   ^
>> 0x1645c6b internal_error(char const*, ...)
>>         ???:0
>> 0xe1f4f4 place_block_symbol(rtx_def*)
>>         ???:0
>> 0x84ab33 use_anchored_address(rtx_def*)
>>         ???:0
>> 0x868203 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>         ???:0
>> 0x868793 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>         ???:0
>> 0x75b593 expand_call(tree_node*, rtx_def*, int)
>>         ???:0
>> 0x86a09f expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>         ???:0
>> Please submit a full bug report
>
> Please file a bugreport and provide a (possibly reduced) testcase.
>

The patch is not yet committed, so I don't think a bug report (in
Bugzilla) is in order.


After fixing a bug pointed out in Honza's review, I cannot replicate an
ICE when building any of 521.wrf_r, 527.cam4_r or 554.roms_r on x86_64
without LTO.  With LTO, however, I get an undefined symbol link error
when building 527.cam4_r, which is certainly a bug in the patch.  I will
investigate, hopefully fix it and re-post the patch, and then I would
appreciate it if you could check it on aarch64 for me.
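
For anyone following along without the testcase at hand, the pattern the
patch targets looks roughly like the sketch below.  It is only an
illustration and not the actual remref-3.c or the 544.nab_r code; all
names (scale, costly_exp, energy, total_energy) are made up, and
costly_exp is a small stand-in for __builtin_exp so that the example
does not depend on libm.

```c
/* Hypothetical illustration: a static variable that is never written,
   so it stays zero for the whole run, but whose address is taken.  */
static double scale;

/* Stand-in for __builtin_exp: a short Taylor series, used here only so
   the sketch is self-contained.  */
static double costly_exp(double x)
{
    double term = 1.0, sum = 1.0;
    for (int i = 1; i <= 20; i++) {
        term *= x / i;
        sum += term;
    }
    return sum;
}

/* The callee only ever reads *k.  noinline keeps it a separate function,
   which is the case where the address reference in the caller hides the
   fact that the variable is read-only.  */
static double __attribute__((noinline)) energy(const double *k, double r)
{
    return costly_exp(-(*k) * r * r);
}

/* The caller passes the variable by reference (an address reference),
   although a plain load of its value would have been enough.  */
double total_energy(const double *r, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += energy(&scale, r[i]);
    return sum;
}
```

Since scale is always zero, every call to energy computes the
exponential of zero, which is the kind of work that can go away once
the variable is recognized as read-only.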

Thanks,

Martin
