> On 12.04.2022 at 17:08, Thomas Schwinge <tho...@codesourcery.com> wrote:
> 
> Hi!
> 
>> On 2022-04-12T15:45:03+0200, Richard Biener <rguent...@suse.de> wrote:
>>> On Tue, 12 Apr 2022, Thomas Schwinge wrote:
>>> On 2022-04-07T15:04:15+0200, Richard Biener via Gcc-patches 
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>> On Thu, 7 Apr 2022, Jan Hubicka wrote:
>>>>>> On Thu, 7 Apr 2022, Jan Hubicka wrote:
>>>>>>> this patch fixes a miscompilation of gnatmake.  Modref attempts to track
>>>>>>> memory accesses relative to the base pointers which are parameters of
>>>>>>> functions.  If it fails, it still distinguishes between an unknown memory
>>>>>>> access and a global memory access.  The latter makes it possible to
>>>>>>> disambiguate with memory that is not accessible from the outside world
>>>>>>> (i.e. everything that does not escape from the caller function).  This is
>>>>>>> useful so we do not punt when an unknown function is called.
>>>>>>> 
>>>>>>> Now I added ref_may_access_global_memory_p to tree-ssa-alias, which is
>>>>>>> using ptr_deref_may_alias_global_p.  There is however a shift in meaning
>>>>>>> of this predicate: the latter tests whether the dereference may alias
>>>>>>> with a global variable.
>>>>>>> 
>>>>>>> In the testcase we are disambiguating heap-allocated escaping memory,
>>>>>>> which is not a global variable but is still global memory in modref's
>>>>>>> sense.  So we need to test contains_escaped in addition.
>>>>>>> 
>>>>>>> The patch simply copies the logic from the predicate and adds the check.
>>>>>>> I am not sure if there is a better way to handle this?
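
The distinction drawn above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyPointsTo` and both predicates are hypothetical names made up here, not GCC's points-to data structure or the functions named in the message. The sketch shows why "may alias a global variable" is a narrower question than "may access global memory" in modref's sense.

```cpp
#include <cassert>

// Hypothetical stand-in for a points-to summary; NOT GCC's real structure.
struct ToyPointsTo {
  bool may_point_to_global_var;   // may refer to a global variable
  bool may_point_to_escaped_mem;  // may refer to memory whose address escaped
};

// Narrow question: can a dereference alias a global *variable*?
bool toy_may_alias_global_variable(const ToyPointsTo &pt) {
  return pt.may_point_to_global_var;
}

// Broad question (modref's sense): can the dereference touch memory that is
// reachable from outside the current function?  Escaped heap memory counts.
bool toy_may_access_global_memory(const ToyPointsTo &pt) {
  return pt.may_point_to_global_var || pt.may_point_to_escaped_mem;
}

void check_global_memory_examples() {
  // Heap block whose address escaped: not a global variable, yet "global memory".
  ToyPointsTo escaped_heap{false, true};
  assert(!toy_may_alias_global_variable(escaped_heap));
  assert(toy_may_access_global_memory(escaped_heap));

  // Purely local, non-escaping memory: neither.
  ToyPointsTo private_local{false, false};
  assert(!toy_may_alias_global_variable(private_local));
  assert(!toy_may_access_global_memory(private_local));
}
```

In this model, using the narrow predicate where the broad one is meant would wrongly treat the escaped heap block as inaccessible from outside, which is the kind of mismatch the message describes.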
>>>>>> 
>>>>>> I'm testing the following variant, which exposes this detail
>>>>>> (whether escaped local memory counts as global or not) in the APIs that
>>>>>> say "global", which allows removing ref_may_access_global_memory_p.
>>>>> 
>>>>> Thank you.  Indeed it is better to have an explicit flag, since the
>>>>> clash of names is a bit sensitive.
>>>> 
>>>> OK - bootstrapped / tested on x86_64-unknown-linux-gnu including Ada
>>>> and now pushed.
>>> 
>>> This commit r12-8048-g8c0ebaf9f586100920a3c0849fb10e9985d7ae58
>>> "ipa/104303 - miscompilation of gnatmake" is causing one regression in
>>> nvptx offload testing:
>>> 
>>>    [...]
>>>    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/private-variables.f90 
>>> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1   
>>> at line 142 (test for bogus messages, line 131)
>>>    [...]
>>> 
>>> I've done a before/after 'diff' of
>>> '-fdump-tree-all -foffload-options=nvptx-none=-fdump-tree-all'
>>> with all functions and calls other than 't4' commented out.
>> 
>> I suppose the
>> 
>> diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
>> index 2a13ea34829..34ce8abe33a 100644
>> --- a/gcc/tree-ssa-dce.cc
>> +++ b/gcc/tree-ssa-dce.cc
>> @@ -315,7 +315,7 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
>>     }
>> 
>>   if ((gimple_vdef (stmt) && keep_all_vdefs_p ())
>> -      || stmt_may_clobber_global_p (stmt))
>> +      || stmt_may_clobber_global_p (stmt, true))
>>     {
>>       mark_stmt_necessary (stmt, true);
>>       return;
>> 
>> was overly conservative (I was probably misled by keep_all_vdefs_p ()),
>> passing false to stmt_may_clobber_global_p should fix the regression?
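
A toy model of the choice discussed here, assuming a store whose target is either a global variable, a local whose address has escaped, or a purely private local. `ToyTarget`, `toy_store_may_clobber_global`, `toy_keep_store` and the flag name `escaped_local_is_global` are invented for illustration; they are not the GCC functions in the diff, and the real predicate works on GIMPLE statements and points-to information.

```cpp
#include <cassert>

// Hypothetical classification of a store's target; not a GCC data structure.
enum class ToyTarget { GlobalVariable, EscapedLocal, PrivateLocal };

// Toy predicate: may this store clobber "global" memory?  The flag decides
// whether a local whose address escaped is counted as global.
bool toy_store_may_clobber_global(ToyTarget target, bool escaped_local_is_global) {
  switch (target) {
    case ToyTarget::GlobalVariable: return true;
    case ToyTarget::EscapedLocal:   return escaped_local_is_global;
    case ToyTarget::PrivateLocal:   return false;
  }
  return true;  // not reached for valid enumerators; stay conservative
}

// Toy version of the question DCE asks up front: must the store be kept
// regardless of whether anything later reads it?
bool toy_keep_store(ToyTarget target, bool escaped_local_is_global) {
  return toy_store_may_clobber_global(target, escaped_local_is_global);
}

void check_flag_examples() {
  // With the flag set, stores to an escaped local are always kept.
  assert(toy_keep_store(ToyTarget::EscapedLocal, true));
  // With the flag clear, they are no longer kept unconditionally.
  assert(!toy_keep_store(ToyTarget::EscapedLocal, false));
  // Stores to global variables are kept either way.
  assert(toy_keep_store(ToyTarget::GlobalVariable, false));
  assert(!toy_keep_store(ToyTarget::PrivateLocal, true));
}
```

In the toy model, a store that is not kept unconditionally is only a candidate for removal; it would still have to be unused to be deleted.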
> 
> It does, thanks.  Per more 'diff'ing, this change again enables
> the '-O1' host compilation 'a-private-variables.f90.117t.dce2' to
> clean this up, and likewise for nvptx offload compilation
> 'a.xnvptx-none.mkoffload.117t.dce2', thus the extra diagnostic
> again disappears.  In fact, for all optimization flag variants
> that I've tried, we've again got exactly the same dumps as before
> commit r12-8048-g8c0ebaf9f586100920a3c0849fb10e9985d7ae58
> "ipa/104303 - miscompilation of gnatmake"!
> 
> So I'll run that through standard bootstrap/regression testing, and push?
> Got a suggestion for rationale to put into the commit log?

I have already tested and pushed it.

Richard.

> 
> Regards
> Thomas
> 
> 
>>> For '-O0', there's no difference at all.
>>> 
>>> For '-O1', for host compilation we see:
>>> 
>>>    diff -ru 0-O1/a-private-variables.f90.117t.dce2 
>>> ./a-private-variables.f90.117t.dce2
>>>    --- 0-O1/a-private-variables.f90.117t.dce2      2022-04-12 
>>> 08:36:54.525302868 +0200
>>>    +++ ./a-private-variables.f90.117t.dce2 2022-04-12 12:51:43.726304109 
>>> +0200
>>>    @@ -30,9 +30,13 @@
>>> 
>>>       <bb 3> [local count: 87490071]:
>>>       # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>>    +  pt.x = .offset.15_2;
>>>       _25 = .offset.15_2 * 2;
>>>    +  pt.y = _25;
>>>       _27 = .offset.15_2 * 4;
>>>    +  pt.z = _27;
>>>       _29 = .offset.15_2 * 6;
>>>    +  pt.attr[4] = _29;
>>> 
>>>       <bb 4> [local count: 1073741824]:
>>>       # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>>    diff -ru 0-O1/a-private-variables.f90.118t.stdarg 
>>> ./a-private-variables.f90.118t.stdarg
>>>    --- 0-O1/a-private-variables.f90.118t.stdarg    2022-04-12 
>>> 08:36:54.525302868 +0200
>>>    +++ ./a-private-variables.f90.118t.stdarg       2022-04-12 
>>> 12:51:43.726304109 +0200
>>>    @@ -4,6 +4,7 @@
>>>     __attribute__((oacc function (1, 1, 1), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    +  struct vec3 pt;
>>>       integer(kind=4) .offset.15_2;
>>>       integer(kind=4) .offset.10_4;
>>>       integer(kind=4) _25;
>>>    @@ -25,9 +26,13 @@
>>> 
>>>       <bb 3> [local count: 87490071]:
>>>       # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>>    +  pt.x = .offset.15_2;
>>>       _25 = .offset.15_2 * 2;
>>>    +  pt.y = _25;
>>>       _27 = .offset.15_2 * 4;
>>>    +  pt.z = _27;
>>>       _29 = .offset.15_2 * 6;
>>>    +  pt.attr[4] = _29;
>>> 
>>>       <bb 4> [local count: 1073741824]:
>>>       # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>>    [Similar for following passes/dumps.]
>>>    diff -ru 0-O1/a-private-variables.f90.141t.lim2 
>>> ./a-private-variables.f90.141t.lim2
>>>    --- 0-O1/a-private-variables.f90.141t.lim2      2022-04-12 
>>> 08:36:54.525302868 +0200
>>>    +++ ./a-private-variables.f90.141t.lim2 2022-04-12 12:51:43.730304125 
>>> +0200
>>>    @@ -24,11 +24,42 @@
>>>     ;; 5 succs { 7 6 }
>>>     ;; 7 succs { 3 }
>>>     ;; 6 succs { 1 }
>>>    +
>>>    +Symbols to be put in SSA form
>>>    +{ D.4340 D.4356 D.4357 D.4358 D.4359 D.4360 D.4361 D.4362 D.4363 }
>>>    +Incremental SSA update started at block: 0
>>>    +Number of blocks in CFG: 9
>>>    +Number of blocks to update: 8 ( 89%)
>>>    +
>>>    +
>>>    +
>>>    +SSA replacement table
>>>    +N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j
>>>    +
>>>    +pt_x_lsm.22_1 -> { pt_x_lsm.22_72 }
>>>    +pt_z_lsm.24_20 -> { pt_z_lsm.24_9 }
>>>    +pt_attr_I_lsm.25_21 -> { pt_attr_I_lsm.25_10 }
>>>    +pt_y_lsm.23_22 -> { pt_y_lsm.23_73 }
>>>    +Incremental SSA update started at block: 3
>>>    +Number of blocks in CFG: 9
>>>    +Number of blocks to update: 3 ( 33%)
>>>    +
>>>    +
>>>     __attribute__((oacc function (1, 1, 1), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    +  integer(kind=4) D.4363;
>>>    +  integer(kind=4) pt_attr_I_lsm.25;
>>>    +  integer(kind=4) D.4361;
>>>    +  integer(kind=4) pt_z_lsm.24;
>>>    +  integer(kind=4) D.4359;
>>>    +  integer(kind=4) pt_y_lsm.23;
>>>    +  integer(kind=4) D.4357;
>>>    +  integer(kind=4) pt_x_lsm.22;
>>>    +  struct vec3 pt;
>>>       integer(kind=4) .offset.15_2;
>>>       integer(kind=4) .offset.10_4;
>>>    +  integer(kind=4) _7(D);
>>>       integer(kind=4) _25;
>>>       integer(kind=4) _27;
>>>       integer(kind=4) _29;
>>>    @@ -37,21 +68,32 @@
>>>       integer(kind=8) _43;
>>>       integer(kind=4)[1025] * _45;
>>>       integer(kind=4) _46;
>>>    +  integer(kind=4) _47(D);
>>>       integer(kind=4) _48;
>>>       integer(kind=4) _50;
>>>    +  integer(kind=4) _51(D);
>>>       integer(kind=4) _52;
>>>       integer(kind=4) _54;
>>>       integer(kind=4) .offset.10_56;
>>>       integer(kind=4) .offset.15_63;
>>>    +  integer(kind=4) _70(D);
>>> 
>>>       <bb 2> [local count: 7128820]:
>>>    +  pt_x_lsm.22_8 = _7(D);
>>>    +  pt_y_lsm.23_49 = _47(D);
>>>    +  pt_z_lsm.24_53 = _51(D);
>>>    +  pt_attr_I_lsm.25_71 = _70(D);
>>>       _45 = *.omp_data_i_44(D).arr;
>>> 
>>>       <bb 3> [local count: 87490071]:
>>>       # .offset.15_2 = PHI <0(2), .offset.15_63(7)>
>>>    +  pt_x_lsm.22_72 = .offset.15_2;
>>>       _25 = .offset.15_2 * 2;
>>>    +  pt_y_lsm.23_73 = _25;
>>>       _27 = .offset.15_2 * 4;
>>>    +  pt_z_lsm.24_9 = _27;
>>>       _29 = .offset.15_2 * 6;
>>>    +  pt_attr_I_lsm.25_10 = _29;
>>>       _41 = .offset.15_2 * 32;
>>> 
>>>       <bb 4> [local count: 1073741824]:
>>>    @@ -84,6 +126,14 @@
>>>       goto <bb 3>; [100.00%]
>>> 
>>>       <bb 6> [local count: 35644102]:
>>>    +  # pt_z_lsm.24_20 = PHI <pt_z_lsm.24_9(5)>
>>>    +  # pt_attr_I_lsm.25_21 = PHI <pt_attr_I_lsm.25_10(5)>
>>>    +  # pt_x_lsm.22_1 = PHI <pt_x_lsm.22_72(5)>
>>>    +  # pt_y_lsm.23_22 = PHI <pt_y_lsm.23_73(5)>
>>>    +  pt.attr[4] = pt_attr_I_lsm.25_21;
>>>    +  pt.z = pt_z_lsm.24_20;
>>>    +  pt.y = pt_y_lsm.23_22;
>>>    +  pt.x = pt_x_lsm.22_1;
>>>       return;
>>> 
>>>     }
>>>    [Similar for following passes/dumps.]
>>>    diff -ru 0-O1/a-private-variables.f90.148t.dse3 
>>> ./a-private-variables.f90.148t.dse3
>>>    --- 0-O1/a-private-variables.f90.148t.dse3      2022-04-12 
>>> 08:36:54.525302868 +0200
>>>    +++ ./a-private-variables.f90.148t.dse3 2022-04-12 12:51:43.730304125 
>>> +0200
>>>    @@ -1,11 +1,24 @@
>>> 
>>>     ;; Function t4_._omp_fn.0 (t4_._omp_fn.0, funcdef_no=3, decl_uid=4278, 
>>> cgraph_uid=4, symbol_order=3)
>>> 
>>>    +Removing basic block 7
>>>    +Removing basic block 8
>>>    +Removing basic block 9
>>>     __attribute__((oacc function (1, 1, 1), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    +  integer(kind=4) D.4363;
>>>    +  integer(kind=4) pt_attr_I_lsm.25;
>>>    +  integer(kind=4) D.4361;
>>>    +  integer(kind=4) pt_z_lsm.24;
>>>    +  integer(kind=4) D.4359;
>>>    +  integer(kind=4) pt_y_lsm.23;
>>>    +  integer(kind=4) D.4357;
>>>    +  integer(kind=4) pt_x_lsm.22;
>>>    +  struct vec3 pt;
>>>       integer(kind=4) .offset.15_2;
>>>       integer(kind=4) .offset.10_4;
>>>    +  integer(kind=4) _7(D);
>>>       integer(kind=4) _25;
>>>       integer(kind=4) _27;
>>>       integer(kind=4) _29;
>>>    @@ -14,25 +27,28 @@
>>>       integer(kind=8) _43;
>>>       integer(kind=4)[1025] * _45;
>>>       integer(kind=4) _46;
>>>    +  integer(kind=4) _47(D);
>>>       integer(kind=4) _48;
>>>       integer(kind=4) _50;
>>>    +  integer(kind=4) _51(D);
>>>       integer(kind=4) _52;
>>>       integer(kind=4) _54;
>>>       integer(kind=4) .offset.10_56;
>>>       integer(kind=4) .offset.15_63;
>>>    +  integer(kind=4) _70(D);
>>> 
>>>       <bb 2> [local count: 7128820]:
>>>       _45 = *.omp_data_i_44(D).arr;
>>> 
>>>       <bb 3> [local count: 87490071]:
>>>    -  # .offset.15_2 = PHI <0(2), .offset.15_63(7)>
>>>    +  # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>>       _25 = .offset.15_2 * 2;
>>>       _27 = .offset.15_2 * 4;
>>>       _29 = .offset.15_2 * 6;
>>>       _41 = .offset.15_2 * 32;
>>> 
>>>       <bb 4> [local count: 1073741824]:
>>>    -  # .offset.10_4 = PHI <0(3), .offset.10_56(8)>
>>>    +  # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>>       _42 = .offset.10_4 + _41;
>>>       _43 = (integer(kind=8)) _42;
>>>       _46 = (*_45)[_43];
>>>    @@ -43,23 +59,17 @@
>>>       (*_45)[_43] = _54;
>>>       .offset.10_56 = .offset.10_4 + 1;
>>>       if (.offset.10_56 <= 31)
>>>    -    goto <bb 8>; [89.00%]
>>>    +    goto <bb 4>; [89.00%]
>>>       else
>>>         goto <bb 5>; [11.00%]
>>> 
>>>    -  <bb 8> [local count: 955630224]:
>>>    -  goto <bb 4>; [100.00%]
>>>    -
>>>       <bb 5> [local count: 437450365]:
>>>       .offset.15_63 = .offset.15_2 + 1;
>>>       if (.offset.15_63 <= 31)
>>>    -    goto <bb 7>; [89.00%]
>>>    +    goto <bb 3>; [89.00%]
>>>       else
>>>         goto <bb 6>; [11.00%]
>>> 
>>>    -  <bb 7> [local count: 389330825]:
>>>    -  goto <bb 3>; [100.00%]
>>>    -
>>>       <bb 6> [local count: 35644102]:
>>>       return;
>>> 
>>>    [Similar for following passes/dumps.]
>>> 
>>> ..., so in 'a-private-variables.f90.148t.dse3', the 'pt.{x,y,z,attr}'
>>> assignments for the new 'struct vec3 pt;' get cleaned out, so that should
>>> all be fine; no actual changes in the end.
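
Why removing these stores changes nothing observable can be sketched in plain C++. This is an illustrative analogue only, not the Fortran test case: `ToyVec3`, its array size and both functions are assumptions, with the multipliers taken from the dump above. Once the computed values are used directly, the stores into the local aggregate are dead and both versions compute the same result.

```cpp
#include <cassert>

// Hypothetical analogue of 'struct vec3'; the array size is an assumption.
struct ToyVec3 {
  int x, y, z;
  int attr[8];
};

// Version that still writes the values into a local aggregate.
int toy_sum_with_stores(int offset) {
  ToyVec3 pt;
  pt.x = offset;
  pt.y = offset * 2;
  pt.z = offset * 4;
  pt.attr[4] = offset * 6;
  return pt.x + pt.y + pt.z + pt.attr[4];
}

// Version after the values are forwarded and the stores are gone.
int toy_sum_forwarded(int offset) {
  int y = offset * 2;
  int z = offset * 4;
  int a = offset * 6;
  return offset + y + z + a;
}

void check_dead_store_examples() {
  // Both versions agree for every tested input.
  for (int offset = 0; offset < 32; ++offset)
    assert(toy_sum_with_stores(offset) == toy_sum_forwarded(offset));
}
```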
>>> 
>>> Comparing '-O1' nvptx offload target compilation before/after, the first
>>> difference is in 'a.xnvptx-none.mkoffload.117t.dce2': similar to host
>>> compilation.  But then, in the following passes, things do not get cleaned up as
>>> they do for the host compilation; the 'pt.{x,y,z,attr}' assignments for
>>> the new 'struct vec3 pt;' persist:
>>> 
>>>    diff -ru 0-O1/a.xnvptx-none.mkoffload.252t.optimized 
>>> ./a.xnvptx-none.mkoffload.252t.optimized
>>>    --- 0-O1/a.xnvptx-none.mkoffload.252t.optimized 2022-04-12 
>>> 08:36:54.569303204 +0200
>>>    +++ ./a.xnvptx-none.mkoffload.252t.optimized    2022-04-12 
>>> 12:51:43.774304292 +0200
>>>    @@ -7,34 +7,36 @@
>>>     __attribute__((oacc function (32, 8, 32), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    -  unsigned int ivtmp$6;
>>>    +  unsigned int ivtmp$8;
>>>       unsigned int ivtmp$5;
>>>    +  unsigned int ivtmp$3;
>>>       int D.1527;
>>>       int D.1524;
>>>    +  struct vec3 pt;
>>>       int _2;
>>>    +  int[1025] * _4;
>>>    +  int _22;
>>>       int _23;
>>>       int _25;
>>>       int _27;
>>>       int _29;
>>>       int _34;
>>>    -  int[1025] * _43;
>>>    -  sizetype _45;
>>>    -  sizetype _46;
>>>    -  int[1025] * _47;
>>>    -  unsigned int _48;
>>>    +  sizetype _41;
>>>    +  unsigned int _42;
>>>    +  unsigned int _43;
>>>    +  unsigned int _45;
>>>    +  int _46;
>>>       int _49;
>>>    -  unsigned int _50;
>>>       int _51;
>>>    -  int _52;
>>>    -  int _53;
>>>    -  int _63;
>>>    -  int _82;
>>>    -  int _83;
>>>    -  int _87;
>>>    +  int _54;
>>>    +  sizetype _63;
>>>       int _95;
>>>    +  int _96;
>>>    +  int _99;
>>>       int _102;
>>>    -  int _103;
>>>    +  int[1025] * _103;
>>>       int _104;
>>>    +  int _105;
>>>       int _107;
>>> 
>>>       <bb 2> [local count: 7128820]:
>>>    @@ -47,43 +49,50 @@
>>>         goto <bb 7>; [73.00%]
>>> 
>>>       <bb 3> [local count: 1924781]:
>>>    -  _87 = _104 * 32;
>>>    -  ivtmp$5_80 = (unsigned int) _87;
>>>    -  _52 = _104 * 2;
>>>    -  ivtmp$6_54 = (unsigned int) _52;
>>>    +  ivtmp$3_61 = (unsigned int) _104;
>>>    +  _54 = _104 * 2;
>>>    +  ivtmp$5_56 = (unsigned int) _54;
>>>    +  _46 = _104 * 32;
>>>    +  ivtmp$8_48 = (unsigned int) _46;
>>>    +  _42 = (unsigned int) _23;
>>> 
>>>       <bb 4> [local count: 87490071]:
>>>    -  # _2 = PHI <_104(3), _63(6)>
>>>    -  # ivtmp$5_22 = PHI <ivtmp$5_80(3), ivtmp$5_81(6)>
>>>    -  # ivtmp$6_72 = PHI <ivtmp$6_54(3), ivtmp$6_56(6)>
>>>    -  _25 = (int) ivtmp$6_72;
>>>    -  _50 = ivtmp$6_72 * 2;
>>>    -  _27 = (int) _50;
>>>    -  _48 = ivtmp$6_72 * 3;
>>>    -  _29 = (int) _48;
>>>    +  # ivtmp$3_106 = PHI <ivtmp$3_61(3), ivtmp$3_47(6)>
>>>    +  # ivtmp$5_87 = PHI <ivtmp$5_56(3), ivtmp$5_72(6)>
>>>    +  # ivtmp$8_52 = PHI <ivtmp$8_48(3), ivtmp$8_50(6)>
>>>    +  _2 = (int) ivtmp$3_106;
>>>    +  pt.x = _2;
>>>    +  _25 = (int) ivtmp$5_87;
>>>    +  pt.y = _25;
>>>    +  _45 = ivtmp$5_87 * 2;
>>>    +  _27 = (int) _45;
>>>    +  pt.z = _27;
>>>    +  _43 = ivtmp$5_87 * 3;
>>>    +  _29 = (int) _43;
>>>    +  pt.attr[4] = _29;
>>>       _34 = .UNIQUE (OACC_FORK, 0, 2);
>>> 
>>>       <bb 5> [local count: 437450365]:
>>>       _107 = .GOACC_DIM_POS (2);
>>>    -  _82 = (int) ivtmp$5_22;
>>>    -  _83 = _82 + _107;
>>>    -  _47 = *.omp_data_i_44(D).arr;
>>>    -  _46 = (sizetype) _83;
>>>    -  _45 = _46 * 4;
>>>    -  _43 = _47 + _45;
>>>    -  _49 = MEM <int> [(int[1025] *)_43];
>>>    -  _51 = _2 + _49;
>>>    -  _53 = _25 + _51;
>>>    -  _103 = _27 + _53;
>>>    -  _95 = _29 + _103;
>>>    -  MEM <int> [(int[1025] *)_43] = _95;
>>>    +  _49 = (int) ivtmp$8_52;
>>>    +  _51 = _49 + _107;
>>>    +  _103 = *.omp_data_i_44(D).arr;
>>>    +  _63 = (sizetype) _51;
>>>    +  _41 = _63 * 4;
>>>    +  _4 = _103 + _41;
>>>    +  _95 = MEM <int> [(int[1025] *)_4];
>>>    +  _96 = _2 + _95;
>>>    +  _22 = _25 + _96;
>>>    +  _105 = _22 + _27;
>>>    +  _99 = _29 + _105;
>>>    +  MEM <int> [(int[1025] *)_4] = _99;
>>>       .UNIQUE (OACC_JOIN, _34, 2);
>>> 
>>>       <bb 6> [local count: 87490071]:
>>>    -  _63 = _2 + 1;
>>>    -  ivtmp$5_81 = ivtmp$5_22 + 32;
>>>    -  ivtmp$6_56 = ivtmp$6_72 + 2;
>>>    -  if (_23 != _63)
>>>    +  ivtmp$3_47 = ivtmp$3_106 + 1;
>>>    +  ivtmp$5_72 = ivtmp$5_87 + 2;
>>>    +  ivtmp$8_50 = ivtmp$8_52 + 32;
>>>    +  if (_42 != ivtmp$3_47)
>>>         goto <bb 4>; [89.00%]
>>>       else
>>>         goto <bb 7>; [11.00%]
>>> 
>>> ..., and thus the change in diagnostics:
>>> 
>>>     [...]
>>>     
>>> source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63:
>>>  note: variable ‘pt’ ought to be adjusted for OpenACC privatization level: 
>>> ‘gang’
>>>     
>>> source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63:
>>>  note: variable ‘pt’ ought to be adjusted for OpenACC privatization level: 
>>> ‘gang’
>>>    
>>> +source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90: 
>>> In function ‘t4_._omp_fn.0’:
>>>    
>>> +source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63:
>>>  note: variable ‘pt’ adjusted for OpenACC privatization level: ‘gang’
>>>    +  131 |   !$acc loop gang private(pt) ! { dg-line l_loop[incr c_loop] }
>>>    +      |                                                               ^
>>> 
>>> For '-O2', host compilation begins the same as '-O1', and again in
>>> 'a-private-variables.f90.148t.dse3', the 'pt.{x,y,z,attr}' assignments
>>> for the new 'struct vec3 pt;' get cleaned out:
>>> 
>>>    --- ./a-private-variables.f90.144t.sink1        2022-04-12 
>>> 14:28:19.173520425 +0200
>>>    +++ ./a-private-variables.f90.148t.dse3 2022-04-12 14:28:19.173520425 
>>> +0200
>>>    [...]
>>>     __attribute__((oacc function (1, 1, 1), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    @@ -97,10 +74,6 @@
>>>       goto <bb 3>; [100.00%]
>>> 
>>>       <bb 6> [local count: 35644102]:
>>>    -  pt.attr[4] = _29;
>>>    -  pt.z = _27;
>>>    -  pt.y = _25;
>>>    -  pt.x = .offset.15_2;
>>>       return;
>>> 
>>>     }
>>>    [...]
>>> 
>>> For '-O2', nvptx offload target compilation looks very similar to host
>>> compilation, and again in 'a.xnvptx-none.mkoffload.148t.dse3', the
>>> 'pt.{x,y,z,attr}' assignments for the new 'struct vec3 pt;' get cleaned
>>> out:
>>> 
>>>    --- ./a.xnvptx-none.mkoffload.144t.sink1        2022-04-12 
>>> 14:28:19.213520366 +0200
>>>    +++ ./a.xnvptx-none.mkoffload.148t.dse3 2022-04-12 14:28:19.213520366 
>>> +0200
>>>    [...]
>>>     __attribute__((oacc function (32, 8, 32), oacc parallel, omp target 
>>> entrypoint))
>>>     void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>>     {
>>>    @@ -34,13 +25,9 @@
>>> 
>>>       <bb 2> [local count: 7128820]:
>>>       _104 = .GOACC_DIM_POS (0);
>>>    -  pt.x = _104;
>>>       _25 = _104 * 2;
>>>    -  pt.y = _25;
>>>       _27 = _104 * 4;
>>>    -  pt.z = _27;
>>>       _29 = _104 * 6;
>>>    -  pt.attr[4] = _29;
>>>       _34 = .UNIQUE (OACC_FORK, 0, 2);
>>> 
>>>       <bb 3> [local count: 437450365]:
>>> 
>>> ..., so no actual changes in the end.
>>> 
>>> I have not verified other ("higher") optimization levels, but given no
>>> change in diagnostics, I suppose the same ("no actual changes") happens
>>> for those.
>>> 
>>> Is the '-O1' change/regression unexpected, and should be analyzed, or
>>> should we just accept the slightly worse code generation (for '-O1'
>>> only), and I accordingly adjust the test case for the change in
>>> diagnostics?
>>> 
>>> 
>>> Regards
>>> Thomas
>>> -----------------
>>> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 
>>> 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: 
>>> Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; 
>>> Registergericht München, HRB 106955
>>> 
>> 
>> --
>> Richard Biener <rguent...@suse.de>
>> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
>> Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
