Re: [PATCH v2 06/11] OpenMP: lvalue parsing for map clauses (C++)

2022-11-02 Thread Jakub Jelinek via Fortran
Hi!

Thanks for working on this!

On Tue, Nov 01, 2022 at 09:50:38PM +, Julian Brown wrote:
> > I think we should figure out when we should temporarily disable
> >   parser->omp_array_section_p = false;
> > and restore it afterwards to a saved value.  E.g.
> > cp_parser_lambda_expression seems like a good candidate, the fact that
> > OpenMP array sections are allowed say in map clause doesn't mean they
> > are allowed inside of lambdas and it would be especially hard when
> > the lambda is defining a separate function and the search for
> > OMP_ARRAY_SECTION probably wouldn't be able to discover those.
> > Other spots to consider might be statement expressions, perhaps type
> > definitions etc.
> 
> I've had a go at doing this -- several expression types now forbid
> array-section syntax (see new "bad-array-section-*" tests added). I'm
> afraid my C++ isn't quite up to figuring out how it's possible to
> define a type inside an expression (inside a map clause) if we forbid
> lambdas and statement expressions though -- can you give an example?

But we can't forbid lambdas inside of the map clause expressions,
they are certainly valid in OpenMP, and IMNSHO shouldn't disallow statement
expressions, people might not even know they use a statement expression,
they could just use some standard macro which uses a statement expression
under the hood.  Though your testcases look good.
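
For reference, the "temporarily disable parser->omp_array_section_p and
restore it afterwards" idea quoted above can be sketched with a small RAII
guard.  This is only an illustration under stated assumptions: mock_parser,
flag_override and the two parse_* functions are hypothetical names, not the
code in parser.cc; only the flag name omp_array_section_p comes from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the C++ front end's parser state; only the
// flag discussed in this thread is modelled.
struct mock_parser
{
  bool omp_array_section_p = false;
};

// RAII helper (hypothetical): set a flag to a new value and restore the
// saved value when the enclosing scope ends.
class flag_override
{
public:
  flag_override (bool &flag, bool value) : m_flag (flag), m_saved (flag)
  {
    m_flag = value;
  }
  ~flag_override () { m_flag = m_saved; }
  flag_override (const flag_override &) = delete;
  flag_override &operator= (const flag_override &) = delete;

private:
  bool &m_flag;
  bool m_saved;
};

// Hypothetical nested construct (think: lambda body or statement
// expression).  Returns whether array-section syntax would be accepted
// while parsing inside it.
bool
parse_nested_construct (mock_parser &parser)
{
  flag_override guard (parser.omp_array_section_p, false);
  return parser.omp_array_section_p;
}

// Hypothetical clause-operand parser: array sections are enabled for the
// operand itself, but not inside the nested construct.
bool
parse_map_clause_operand (mock_parser &parser)
{
  flag_override guard (parser.omp_array_section_p, true);
  bool allowed_inside = parse_nested_construct (parser);
  // Back in the clause operand, the flag has been restored to true.
  return !allowed_inside && parser.omp_array_section_p;
}
```

The point of the guard is that the lambda itself stays valid inside the
clause; only the array-section syntax is switched off while its body is
parsed, and the previous value comes back automatically on every exit path.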

> > This shouldn't be done just for OMP_CLAUSE_MAP, but for all the
> > other clauses that accept array sections, including
> > OMP_CLAUSE_DEPEND, OMP_CLAUSE_AFFINITY, OMP_CLAUSE_MAP, OMP_CLAUSE_TO,
> > OMP_CLAUSE_FROM, OMP_CLAUSE_INCLUSIVE, OMP_CLAUSE_EXCLUSIVE,
> > OMP_CLAUSE_USE_DEVICE_ADDR, OMP_CLAUSE_HAS_DEVICE_ADDR,
> > OMP_CLAUSE_*REDUCTION.
> 
> I'm not too sure about all of those -- Tobias points out that
> "INCLUSIVE", "EXCLUSIVE", *DEVICE* and *REDUCTION* take "variable list"
> item types, not "locator list", though sometimes with an array section
> being permitted (in OpenMP 5.2+).

That is true.  The clauses that use variable lists rather than locator
lists but still accept array sections have strict restrictions on what one
can use: basically one can only have varname or varname[...] or
varname[...][...] etc. where ... is the normal array element or array
section syntax.  So, we probably should continue to parse them as now,
but we can use OMP_ARRAY_SECTION to hold what we've parsed or even share
code with parsing array sections and the [...] on those clauses.
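
As a rough illustration of the restricted shape described above (varname
followed by zero or more [...] groups), a string-level check might look like
the following.  It is a sketch under assumptions: is_restricted_list_item is
a hypothetical helper, it works on text rather than on parser tokens, it does
not skip whitespace, and it does not validate what is written between the
brackets.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Returns true if ITEM has the shape  identifier ( '[' ... ']' )*  .
// The text between brackets is not parsed; only bracket nesting is
// checked, and an empty "[]" group is rejected.
bool
is_restricted_list_item (std::string_view item)
{
  auto is_ident_start
    = [] (unsigned char c) { return std::isalpha (c) || c == '_'; };
  auto is_ident_char
    = [] (unsigned char c) { return std::isalnum (c) || c == '_'; };

  if (item.empty () || !is_ident_start (item[0]))
    return false;

  std::size_t i = 0;
  while (i < item.size () && is_ident_char (item[i]))
    ++i;

  // Everything after the name must be a sequence of bracket groups.
  while (i < item.size ())
    {
      if (item[i] != '[')
        return false;
      const std::size_t start = i;
      int depth = 0;
      for (; i < item.size (); ++i)
        {
          if (item[i] == '[')
            ++depth;
          else if (item[i] == ']' && --depth == 0)
            break;
        }
      if (i == item.size ())
        return false;           // Unbalanced brackets.
      if (i == start + 1)
        return false;           // Empty "[]".
      ++i;                      // Skip the closing ']'.
    }
  return true;
}
```

With this check, "a", "a[0:n]" and "a[i][0:n]" are accepted, while
"s.member[0:n]" and "*p" are rejected, which is the difference between a
variable-list item and a general locator.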

> Tested (alongside next patch) with offloading to NVPTX -- with my
> previously-posted "address tokenization" patch also applied.

> 2022-11-01  Julian Brown  
> 
> gcc/c-family/
>   * c-omp.cc (c_omp_address_inspector::map_supported_p): Handle
>   OMP_ARRAY_SECTION.
> 
> gcc/cp/
>   * constexpr.cc (potential_constant_expression_1): Handle
>   OMP_ARRAY_SECTION.
>   * error.cc (dump_expr): Handle OMP_ARRAY_SECTION.
>   * parser.cc (cp_parser_new): Initialize parser->omp_array_section_p.
>   (cp_parser_statement_expr): Disallow array sections.
>   (cp_parser_postfix_open_square_expression): Support OMP_ARRAY_SECTION
>   parsing.
>   (cp_parser_parenthesized_expression_list, cp_parser_lambda_expression,
>   cp_parser_braced_list): Disallow array sections.
>   (cp_parser_omp_var_list_no_open): Remove ALLOW_DEREF parameter, add
>   MAP_LVALUE in its place.  Support generalised lvalue parsing for
>   OpenMP map, to and from clauses.
>   (cp_parser_omp_var_list): Remove ALLOW_DEREF parameter, add MAP_LVALUE.
>   Pass to cp_parser_omp_var_list_no_open.
>   (cp_parser_oacc_data_clause, cp_parser_omp_all_clauses): Update calls
>   to cp_parser_omp_var_list.
>   (cp_parser_omp_clause_map): Add sk_omp scope around
>   cp_parser_omp_var_list_no_open call.
>   * parser.h (cp_parser): Add omp_array_section_p field.
>   * semantics.cc (handle_omp_array_sections_1): Handle more types of map
>   expression.
>   (handle_omp_array_section): Handle non-DECL_P attachment points.
>   (finish_omp_clauses): Check for supported types of expression.
> 
> gcc/
>   * tree-pretty-print.c (dump_generic_node): Support OMP_ARRAY_SECTION.
>   * tree.def (OMP_ARRAY_SECTION): New tree code.
> 
> gcc/testsuite/
>   * c-c++-common/gomp/map-6.c: Update expected output.
>   * g++.dg/gomp/bad-array-section-1.C: New test.
>   * g++.dg/gomp/bad-array-section-2.C: New test.
>   * g++.dg/gomp/bad-array-section-3.C: New test.
>   * g++.dg/gomp/bad-array-section-4.C: New test.
>   * g++.dg/gomp/bad-array-section-5.C: New test.
>   * g++.dg/gomp/bad-array-section-6.C: New test.
>   * g++.dg/gomp/bad-array-section-7.C: New test.
>   * g++.dg/gomp/bad-array-section-8.C: New test.
>   * g++.dg/gomp/bad-array-section-9.C: New test.
>   * g++.dg/gomp/has_device_addr-non-lvalue-1.C: New test.
>   * g++.dg/gomp/pr

Re: [PATCH v2 06/11] OpenMP: lvalue parsing for map clauses (C++)

2022-11-02 Thread Julian Brown
On Wed, 2 Nov 2022 12:58:37 +0100
Jakub Jelinek via Fortran  wrote:

> On Tue, Nov 01, 2022 at 09:50:38PM +, Julian Brown wrote:
> > > I think we should figure out when we should temporarily disable
> > >   parser->omp_array_section_p = false;
> > > and restore it afterwards to a saved value.  E.g.
> > > cp_parser_lambda_expression seems like a good candidate, the fact
> > > that OpenMP array sections are allowed say in map clause doesn't
> > > mean they are allowed inside of lambdas and it would be
> > > especially hard when the lambda is defining a separate function
> > > and the search for OMP_ARRAY_SECTION probably wouldn't be able to
> > > discover those. Other spots to consider might be statement
> > > expressions, perhaps type definitions etc.  
> > 
> > I've had a go at doing this -- several expression types now forbid
> > array-section syntax (see new "bad-array-section-*" tests added).
> > I'm afraid my C++ isn't quite up to figuring out how it's possible
> > to define a type inside an expression (inside a map clause) if we
> > forbid lambdas and statement expressions though -- can you give an
> > example?  
> 
> But we can't forbid lambdas inside of the map clause expressions,
> they are certainly valid in OpenMP, and IMNSHO shouldn't disallow
> statement expressions, people might not even know they use a
> statement expression, they could just use some standard macro which
> uses a statement expression under the hood.  Though your testcases
> look good.

I meant "forbid array sections within lambdas and statement
expressions" -- FAOD, does that seem reasonable? Technically it might
not be that hard to support e.g. a statement expression with an array
section on the final expression, but that doesn't work at the moment.
Maybe a follow-on patch could support that if we want it?

I'll take a look at addressing your other review comments, thanks!

Cheers,

Julian


Re: [PATCH v2 06/11] OpenMP: lvalue parsing for map clauses (C++)

2022-11-02 Thread Jakub Jelinek via Fortran
On Wed, Nov 02, 2022 at 12:20:11PM +, Julian Brown wrote:
> > But we can't forbid lambdas inside of the map clause expressions,
> > they are certainly valid in OpenMP, and IMNSHO shouldn't disallow
> > statement expressions, people might not even know they use a
> > statement expression, they could just use some standard macro which
> > uses a statement expression under the hood.  Though your testcases
> > look good.
> 
> I meant "forbid array sections within lambdas and statement
> expressions" -- FAOD, does that seem reasonable? Technically it might

Yeah, my response was to the wording you wrote above the patch, not what
is inside of the patch which looked ok.

> not be that hard to support e.g. a statement expression with an array
> section on the final expression, but that doesn't work at the moment.

And I think we want to keep it that way.

Jakub



[Patch] Fortran/OpenMP: Fix DT struct-component with 'alloc' and array descr

2022-11-02 Thread Tobias Burnus

This fixes an issue with 'alloc:' found when working on the patch
'[Patch] OpenMP/Fortran: 'target update' with strides + DT components'
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604687.html
(BTW: This one is still pending review.)

OK for mainline?

 * * *

I think the patch is a great improvement.

However, again, by writing a testcase, more issues have been found:
* one generic Fortran one, worked around by adding '(:)',
  Cf. https://gcc.gnu.org/PR107508 "Invalid bounds due to bogus reallocation
  on assignment with KIND=4 characters".
* Some other string issues, some of which might be generic Fortran issues
* Some issue with pointers, where 'exit data' gives an error because the
  0x00 and 0x01 kinds are not known by 'target exit data'.
  Those also showed up with the 'target update' patch mentioned above.

For the last two, I used '#if 0' followed by a comment with the current
error message. I do intend to look into those - or at least file a PR.
Likewise for the remaining issues mentioned in the 'target update' patch.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran/OpenMP: Fix DT struct-component with 'alloc' and array descr

Using 'map(alloc: var, dt%comp)' requires a 'to' mapping of
the array descriptor, as otherwise the bounds are not available in the
target region.  Likewise for character strings.

This patch implements this; however, some additional issues are exposed
by the testcase; those are '#if 0'ed and will be handled later.
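
The decision made by the trans-openmp.cc hunk below can be paraphrased as a
small predicate.  This is a simplified sketch, not GCC code: map_kind,
copies_to_device and descriptor_needs_to_mapping are hypothetical names, and
the enum only mimics the role of the GOMP_MAP_* kinds rather than their real
values.

```cpp
#include <cassert>

// Hypothetical, simplified map kinds; the real GOMP_MAP_* values are not
// reproduced here.
enum class map_kind { alloc, to, from, tofrom, release, del };

// Mock of the idea behind GOMP_MAP_COPY_TO_P: true for kinds that copy
// data from the host to the device.
constexpr bool
copies_to_device (map_kind k)
{
  return k == map_kind::to || k == map_kind::tofrom;
}

// True if the array descriptor (or string length) of a derived-type
// component has to be mapped with 'to:'.  Before the patch only the
// copy-to kinds qualified; the patch adds 'alloc:', because the target
// region needs the bounds even when the data itself is not copied.
constexpr bool
descriptor_needs_to_mapping (map_kind data_kind)
{
  return copies_to_device (data_kind) || data_kind == map_kind::alloc;
}
```

The release/delete branch and the remaining kinds are handled separately in
the real function and are deliberately left out of this sketch.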

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Ensure DT struct-comp with
	array descriptor and 'alloc:' have the descriptor mapped with 'to:'.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.

 gcc/fortran/trans-openmp.cc   |3 
 libgomp/testsuite/libgomp.fortran/target-enter-data-3.f90 |  567 ++
 2 files changed, 569 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 4bfdf85cd9b..4eb9d4c9edc 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3507,7 +3507,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			= gfc_full_array_size (block, inner, rank);
 			  tree elemsz
 			= TYPE_SIZE_UNIT (gfc_get_element_type (type));
-			  if (GOMP_MAP_COPY_TO_P (OMP_CLAUSE_MAP_KIND (node)))
+			  if (GOMP_MAP_COPY_TO_P (OMP_CLAUSE_MAP_KIND (node))
+			  || OMP_CLAUSE_MAP_KIND (node) == GOMP_MAP_ALLOC)
 			map_kind = GOMP_MAP_TO;
 			  else if (n->u.map_op == OMP_MAP_RELEASE
    || n->u.map_op == OMP_MAP_DELETE)
diff --git a/libgomp/testsuite/libgomp.fortran/target-enter-data-3.f90 b/libgomp/testsuite/libgomp.fortran/target-enter-data-3.f90
new file mode 100644
index 000..1fe3f03c7b8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-enter-data-3.f90
@@ -0,0 +1,567 @@
+! { dg-additional-options "-cpp" }
+
+! FIXME: Some tests do not work yet. Those are for now in '#if 0'
+
+! Check that 'map(alloc:' properly works with
+! - deferred-length character strings
+! - arrays with array descriptors
+! For those, the array descriptor / string length must be mapped with 'to:'
+
+program main
+implicit none
+
+type t
+  integer :: ic(2:5), ic2
+  character(len=11) :: ccstr(3:4), ccstr2
+  character(len=11,kind=4) :: cc4str(3:7), cc4str2
+  integer, pointer :: pc(:), pc2
+  character(len=:), pointer :: pcstr(:), pcstr2
+  character(len=:,kind=4), pointer :: pc4str(:), pc4str2
+end type t
+
+type(t) :: dt
+
+integer :: ii(5), ii2
+character(len=11) :: clstr(-1:1), clstr2
+character(len=11,kind=4) :: cl4str(0:3), cl4str2
+integer, pointer :: ip(:), ip2
+integer, allocatable :: ia(:), ia2
+character(len=:), pointer :: pstr(:), pstr2
+character(len=:), allocatable :: astr(:), astr2
+character(len=:,kind=4), pointer :: p4str(:), p4str2
+character(len=:,kind=4), allocatable :: a4str(:), a4str2
+
+
+allocate(dt%pc(5), dt%pc2)
+allocate(character(len=2) :: dt%pcstr(2))
+allocate(character(len=4) :: dt%pcstr2)
+
+allocate(character(len=3,kind=4) :: dt%pc4str(2:3))
+allocate(character(len=5,kind=4) :: dt%pc4str2)
+
+allocate(ip(5), ip2, ia(8), ia2)
+allocate(character(len=2) :: pstr(-2:0))
+allocate(character(len=4) :: pstr2)
+allocate(character(len=6) :: astr(3:5))
+allocate(character(len=8) :: astr2)
+
+allocate(character(len=3,kind=4) :: p4str(2:4))
+allocate(character(len=5,kind=4) :: p4str2)
+allocate(character(len=7,kind=4) :: a4str(-2:3))
+allocate(character(len=9,kind=4) :: a4str2)
+
+
+! integer :: ic(2:5), ic2
+
+!$omp target enter data map(alloc: dt%ic)
+!$omp target map(alloc: dt%ic)
+  if (size(dt%ic) /= 4) error stop
+  if (lbound(dt%ic, 1) /= 2) error stop
+  if (ubound(dt%ic, 1) /= 5) error stop
+  dt%ic = [22,

Re: [PATCH, v2] Fortran: ordering of hidden procedure arguments [PR107441]

2022-11-02 Thread Mikael Morin

On 31/10/2022 at 21:29, Harald Anlauf via Fortran wrote:

> Hi Mikael,
>
> thanks a lot, your testcases broke my initial (and incorrect) patch
> in multiple ways.  I understand now that the right solution is much
> simpler and smaller.
>
> I've added your testcases, see attached, with a simple scan of the
> dump for the generated order of hidden arguments in the function decl
> for the last testcase.
>
> Regtested again on x86_64-pc-linux-gnu.  OK now?


Unfortunately no, the coarray case works, but the other problem remains.
The type problem is not visible in the definition of S, it is in the 
declaration of S's prototype in P.


S is defined as:

void s (character(kind=1)[1:_c] & restrict c, integer(kind=4) o,
        logical(kind=1) _o, integer(kind=8) _c)

{
...
}

but P has:

void p ()
{
  static void s (character(kind=1)[1:] & restrict, integer(kind=4),
                 integer(kind=8), logical(kind=1));
  void (*) (character(kind=1)[1:] & restrict, integer(kind=4),
            integer(kind=8), logical(kind=1)) pp;


  pp = s;
...
}
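
To make the mismatch concrete, the two signatures can be written down as C++
function types and compared.  This is only an analogy: the C++ types below
are assumed stand-ins for the Fortran types in the dumps (char * for the
character reference, fixed-width integers for the kinds), and all alias names
are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumed stand-in for logical(kind=1).
using logical1 = std::int8_t;

// Argument order in the definition of s:  c, o, _o, _c.
using s_definition_t
  = void (char *, std::int32_t, logical1, std::int64_t);

// Argument order in the prototype of s as declared inside p:
// c, o, then the integer(kind=8) argument, then the logical(kind=1) one.
using s_prototype_in_p_t
  = void (char *, std::int32_t, std::int64_t, logical1);

// The two hidden trailing arguments are swapped, so the types differ.
constexpr bool hidden_argument_order_matches
  = std::is_same<s_definition_t, s_prototype_in_p_t>::value;

static_assert (!hidden_argument_order_matches,
               "definition and prototype disagree on hidden argument order");
```

A call made through pp therefore passes the two hidden arguments in the
order of the prototype, while the body of s reads them in the order of the
definition.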





Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create)

2022-11-02 Thread Thomas Schwinge
Hi!

On 2017-04-05T08:23:58-0700, Cesar Philippidis  wrote:
> This patch implements the OpenACC 2.5 behavior of fortran allocate on
> variables marked with declare create as defined in Section 2.13.2 in the
> OpenACC spec.

That functionality is still missing in GCC master branch, however a test
case included in that submission here:

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
> @@ -0,0 +1,211 @@
> +! Test declare create with allocatable arrays.

... is useful in a different (though related) context that I'm currently
working on.  Having applied the following changes:

  - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
changes).
  - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
  - Add scanning for OpenACC compiler diagnostics.
  - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).

..., I've then pushed to master branch
commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
"Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.


Regards
 Thomas


From 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe Mon Sep 17 00:00:00 2001
From: Cesar Philippidis 
Date: Wed, 5 Apr 2017 08:23:58 -0700
Subject: [PATCH] Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New.

Co-authored-by: Thomas Schwinge 
---
 .../declare-allocatable-1.f90 | 268 ++
 1 file changed, 268 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
new file mode 100644
index 000..1c8ccd9f61f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
@@ -0,0 +1,268 @@
+! Test OpenACC 'declare create' with allocatable arrays.
+
+! { dg-do run }
+
+!TODO-OpenACC-declare-allocate
+! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
+! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
+! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
+! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
+
+!TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
+
+! { dg-additional-options -fopt-info-all-omp }
+! { dg-additional-options -foffload=-fopt-info-all-omp }
+
+! { dg-additional-options --param=openacc-privatization=noisy }
+! { dg-additional-options -foffload=--param=openacc-privatization=noisy }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable '[Di]\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
+! { dg-additional-options -Wopenacc-parallelism }
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c 0] }
+! { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+
+module vars
+  implicit none
+  integer, parameter :: n = 100
+  real*8, allocatable :: b(:)
+ !$acc declare create (b)
+end module vars
+
+program test
+  use vars
+  use openacc
+  implicit none
+  real*8 :: a
+  integer :: i
+
+  interface
+ subroutine sub1
+   !$acc routine gang
+ end subroutine sub1
+
+ subroutine sub2
+ end subroutine sub2
+
+ real*8 function fun1 (ix)
+   integer ix
+   !$acc routine seq
+ end function fun1
+
+ real*8 function fun2 (ix)
+   integer ix
+   !$acc routine seq
+ end function fun2
+  end interface
+
+  if (allocated (b)) error stop
+
+  ! Test local usage of an allocated declared array.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  a = 2.0
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l$c }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {as

Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90')

2022-11-02 Thread Thomas Schwinge
Hi!

On 2022-11-02T21:04:56+0100, I wrote:
> On 2017-04-05T08:23:58-0700, Cesar Philippidis  wrote:
>> This patch implements the OpenACC 2.5 behavior of fortran allocate on
>> variables marked with declare create as defined in Section 2.13.2 in the
>> OpenACC spec.
>
> That functionality is still missing in GCC master branch, however a test
> case included in that submission here:
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>> @@ -0,0 +1,211 @@
>> +! Test declare create with allocatable arrays.
>
> ... is useful in a different (though related) context that I'm currently
> working on.  Having applied the following changes:
>
>   - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
> changes).
>   - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
>   - Add scanning for OpenACC compiler diagnostics.
>   - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).
>
> ..., I've then pushed to master branch
> commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
> "Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
> @@ -0,0 +1,268 @@
> +! Test OpenACC 'declare create' with allocatable arrays.
> +
> +! { dg-do run }
> +
> +!TODO-OpenACC-declare-allocate
> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
> +! "The 'declare create' directive with a Fortran 'allocatable' has new 
> behavior".
> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
> +
> +[...]

Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
work around (as seen in real-world code), I've pushed to master branch
commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
"Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'", see
attached.


Regards
 Thomas


From 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 14 Oct 2022 17:36:51 +0200
Subject: [PATCH] Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'

... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
for missing support for OpenACC "Changes from Version 2.0 to 2.5":
"The 'declare create' directive with a Fortran 'allocatable' has new behavior".
Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
manually.

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
	New.
---
 ...ble-1.f90 => declare-allocatable-1-runtime.f90} | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-1.f90 => declare-allocatable-1-runtime.f90} (96%)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
similarity index 96%
copy from libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
copy to libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
index 1c8ccd9f61f..e4cb9c378a3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
@@ -3,10 +3,10 @@
 ! { dg-do run }
 
 !TODO-OpenACC-declare-allocate
-! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
 ! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
 ! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
-! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
+! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
+! manually.
 
 !TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
 
@@ -67,6 +67,7 @@ program test
   ! Test local usage of an allocated declared array.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -91,12 +92,14 @@ program test
  if (b(i) /= i*a) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside an acc
   ! routine subroutine.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -114,6 +117,7 @@ program test
  if (b(i) /= i*2) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside a host
@@ -129,6 +133,7 @@ program test
  if (b(i) /= 1.0) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   if (allocated (b)) error stop
@@ -137,6 +142

Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'

2022-11-02 Thread Thomas Schwinge
Hi!

On 2022-11-02T21:10:54+0100, I wrote:
> On 2022-11-02T21:04:56+0100, I wrote:
>> On 2017-04-05T08:23:58-0700, Cesar Philippidis  
>> wrote:
>>> This patch implements the OpenACC 2.5 behavior of fortran allocate on
>>> variables marked with declare create as defined in Section 2.13.2 in the
>>> OpenACC spec.
>>
>> That functionality is still missing in GCC master branch, however a test
>> case included in that submission here:
>>
>>> --- /dev/null
>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>> @@ -0,0 +1,211 @@
>>> +! Test declare create with allocatable arrays.
>>
>> ... is useful in a different (though related) context that I'm currently
>> working on.  Having applied the following changes:
>>
>>   - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
>> changes).
>>   - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
>>   - Add scanning for OpenACC compiler diagnostics.
>>   - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).
>>
>> ..., I've then pushed to master branch
>> commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
>> "Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>> @@ -0,0 +1,268 @@
>> +! Test OpenACC 'declare create' with allocatable arrays.
>> +
>> +! { dg-do run }
>> +
>> +!TODO-OpenACC-declare-allocate
>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> +! "The 'declare create' directive with a Fortran 'allocatable' has new 
>> behavior".
>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>> +
>> +[...]
>
> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
> work around (as seen in real-world code), I've pushed to master branch
> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"

> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
> "The 'declare create' directive with a Fortran 'allocatable' has new 
> behavior".
> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
> manually.

A similar test case, but with different focus, I've pushed to master
branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
"Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
see attached.


Regards
 Thomas


From abeaf3735fe2568b9d5b8096318da866b1fe1e5c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 26 Oct 2022 23:47:29 +0200
Subject: [PATCH] Add
 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
	New.
---
 ...allocatable-array_descriptor-1-runtime.f90 | 402 ++
 1 file changed, 402 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
new file mode 100644
index 000..b27f312631d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
@@ -0,0 +1,402 @@
+! Test OpenACC 'declare create' with allocatable arrays.
+
+! { dg-do run }
+
+! Note that we're not testing OpenACC semantics here, but rather documenting
+! current GCC behavior, specifically, behavior concerning updating of
+! host/device array descriptors.
+! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
+
+!TODO-OpenACC-declare-allocate
+! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
+! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
+! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
+! manually.
+
+
+!TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
+
+
+!TODO OpenACC 'serial' vs. GCC/nvptx:
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+
+! { dg-additional-options -fdump-tree-original }
+! { dg-additional-options -fdump-tree-gimple }
+
+
+module vars
+  implicit none
+  integer, parameter :: n1_lb = -3
+  integer, parameter :: n1_ub = 6
+  integer, parameter :: n2_lb = -
+  integer, parameter :: n2_ub = 2
+
+  integer, allocatable :: b(:)
+  !$acc declare create (b)
+
+end module vars
+
+program test
+  use vars
+  use openacc
+  implicit none
+  integer :: i
+
+  ! Identif

Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]

2022-11-02 Thread Thomas Schwinge
Hi!

On 2022-11-02T21:15:31+0100, I wrote:
> On 2022-11-02T21:10:54+0100, I wrote:
>> On 2022-11-02T21:04:56+0100, I wrote:
>>> --- /dev/null
>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>> @@ -0,0 +1,268 @@
>>> +! Test OpenACC 'declare create' with allocatable arrays.
>>> +
>>> +! { dg-do run }
>>> +
>>> +!TODO-OpenACC-declare-allocate
>>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>> +! "The 'declare create' directive with a Fortran 'allocatable' has new 
>>> behavior".
>>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>>> +
>>> +[...]
>>
>> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
>> work around (as seen in real-world code), I've pushed to master branch
>> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
>> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"
>
>> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
>> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> "The 'declare create' directive with a Fortran 'allocatable' has new 
>> behavior".
>> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>> manually.
>
> A similar test case, but with different focus, I've pushed to master
> branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
> "Add 
> 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
> see attached.

> --- /dev/null
> +++ 
> b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
> @@ -0,0 +1,402 @@
> +! Test OpenACC 'declare create' with allocatable arrays.
> +
> +! { dg-do run }
> +
> +! Note that we're not testing OpenACC semantics here, but rather documenting
> +! current GCC behavior, specifically, behavior concerning updating of
> +! host/device array descriptors.
> +! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
> +
> +!TODO-OpenACC-declare-allocate
> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
> +! "The 'declare create' directive with a Fortran 'allocatable' has new 
> behavior".
> +! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
> +! manually.

If instead of calling 'acc_create'/'acc_delete' we'd like to use
'!$acc enter data create'/'!$acc exit data delete', we run into
PR106643 "[gfortran + OpenACC] Allocate in module causes refcount error".
Pushed to master branch commit da8e0e1191c5512244a752b30dea0eba83e3d10c
"Support OpenACC 'declare create' with Fortran allocatable arrays, part I
[PR106643]", see attached.


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From da8e0e1191c5512244a752b30dea0eba83e3d10c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 27 Oct 2022 21:52:07 +0200
Subject: [PATCH] Support OpenACC 'declare create' with Fortran allocatable
 arrays, part I [PR106643]

	PR libgomp/106643
	libgomp/
	* oacc-mem.c (goacc_enter_data_internal): Support
	OpenACC 'declare create' with Fortran allocatable arrays, part I.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
	New.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
	New.
---
 libgomp/oacc-mem.c| 28 +--
 ...90 => declare-allocatable-1-directive.f90} | 14 --
 ...ocatable-array_descriptor-1-directive.f90} | 12 
 3 files changed, 44 insertions(+), 10 deletions(-)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-1.f90 => declare-allocatable-1-directive.f90} (95%)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-array_descriptor-1-runtime.f90 => declare-allocatable-array_descriptor-1-directive.f90} (98%)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 73b2710c2b8..ba010fddbb3 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1150,8 +1150,7 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	}
   else if (n && groupnum > 1)
 	{
-	  assert (n->refcount != REFCOUNT_INFINITY
-		  && n->refcount != REFCOUNT_LINK);
+	  assert (n->refcount != REFCOUNT_LINK);
 
 	  for (size_t j = i + 1; j <= group_last; j++)
 	if ((kinds[j] & 0xff) == GOMP_MAP_ATTACH)
@@ -1166,6 +1165,31 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	  bool processed = false;
 
 	  struct target_mem_desc *tgt = n->tgt;
+
+	  /* Arrange so that OpenACC 'declare' code à la PR106643
+	 "[gfortran + OpenACC] Allocate in module causes refcount error"
+	 has a chance to work.  */
+	  if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
+	  && tgt->list_count == 0)
+	{
+	  /* 'declare target'

Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] (was: Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643])

2022-11-02 Thread Thomas Schwinge
Hi!

On 2022-11-02T21:22:25+0100, I wrote:
> On 2022-11-02T21:15:31+0100, I wrote:
>> On 2022-11-02T21:10:54+0100, I wrote:
>>> On 2022-11-02T21:04:56+0100, I wrote:
>>>> --- /dev/null
>>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>>> @@ -0,0 +1,268 @@
>>>> +! Test OpenACC 'declare create' with allocatable arrays.
>>>> +
>>>> +! { dg-do run }
>>>> +
>>>> +!TODO-OpenACC-declare-allocate
>>>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>>>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>>>> +
>>>> +[...]
>>>
>>> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
>>> work around (as seen in real-world code), I've pushed to master branch
>>> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
>>> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"
>>
>>> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
>>> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>> "The 'declare create' directive with a Fortran 'allocatable' has new 
>>> behavior".
>>> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>>> manually.
>>
>> A similar test case, but with different focus, I've pushed to master
>> branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
>> "Add 
>> 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
>> see attached.
>
>> --- /dev/null
>> +++ 
>> b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
>> @@ -0,0 +1,402 @@
>> +! Test OpenACC 'declare create' with allocatable arrays.
>> +
>> +! { dg-do run }
>> +
>> +! Note that we're not testing OpenACC semantics here, but rather documenting
>> +! current GCC behavior, specifically, behavior concerning updating of
>> +! host/device array descriptors.
>> +! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
>> +
>> +!TODO-OpenACC-declare-allocate
>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> +! "The 'declare create' directive with a Fortran 'allocatable' has new 
>> behavior".
>> +! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>> +! manually.
>
> If instead of calling 'acc_create'/'acc_delete' we'd like to use
> '!$acc enter data create'/'!$acc exit data delete', we run into PR106643
> "[gfortran + OpenACC] Allocate in module causes refcount error".
> Pushed to master branch commit da8e0e1191c5512244a752b30dea0eba83e3d10c
> "Support OpenACC 'declare create' with Fortran allocatable arrays, part I
> [PR106643]", see attached.

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -1166,6 +1165,31 @@ goacc_enter_data_internal (struct gomp_device_descr 
> *acc_dev, size_t mapnum,
> bool processed = false;
>
> struct target_mem_desc *tgt = n->tgt;
> +
> +   /* Arrange so that OpenACC 'declare' code à la PR106643
> +  "[gfortran + OpenACC] Allocate in module causes refcount error"
> +  has a chance to work.  */
> +   if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
> +   && tgt->list_count == 0)
> + {
> +   /* 'declare target'.  */
> +   assert (n->refcount == REFCOUNT_INFINITY);
> +
> +   for (size_t k = 1; k < groupnum; k++)
> + {
> +   /* The only thing we expect to see here.  */
> +   assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
> + }
> +
> +   /* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
> +  will always see 'n->refcount == REFCOUNT_INFINITY',
> +  there's no need to adjust 'n->dynamic_refcount' here.  */
> +
> +   processed = true;
> + }

To make slightly more interesting (real-world) test cases work, we also
have to process the 'GOMP_MAP_TO_PSET', 'GOMP_MAP_POINTER' mappings here.
Tobias had implemented such a thing in context of OpenMP PR96668
"[OpenMP] Re-mapping allocated but previously unallocated allocatable does
not work" a while ago, and we may do similar here.  Side note: in the first
version of my changes, I had actually re-implemented the corresponding --
"somewhat ugly" -- logic here in
'libgomp/oacc-mem.c:goacc_enter_data_internal', until at some point I
realized that I could instead simply call into the existing code, greatly
reducing the complexity here...  Pushed to master branch
commit f6ce1e77bbf5d3a096f52e674bfd7354c6537d10
"Support OpenACC 'declare create' with Fortran allocatable arrays, part II
[PR106643, PR96668]", see attached.
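
To spell out what the quoted part I hunk expects of a mapping group, here is
a minimal stand-alone sketch in C.  It is not the libgomp code: the 'MODEL_*'
constants and the function name are hypothetical stand-ins (the real
'GOMP_MAP_*' values are defined elsewhere in GCC and are not reproduced here),
the 'unsigned short' element type is an assumption, and where the hunk uses
'assert' the sketch returns a flag instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mapping kinds; NOT the real GOMP_MAP_*
   values.  */
enum model_map_kind
{
  MODEL_MAP_TO_PSET = 1,  /* array descriptor */
  MODEL_MAP_POINTER = 2,  /* pointer fix-up following the descriptor */
  MODEL_MAP_ATTACH = 3    /* any other kind, used only as a counterexample */
};

/* Return true if the GROUPNUM entries starting at KINDS[I] are one
   array-descriptor mapping followed only by pointer mappings.  As in the
   quoted hunk, only the low byte of each entry carries the mapping kind.  */
bool
model_group_is_pset_plus_pointers (const unsigned short *kinds, size_t i,
                                   size_t groupnum)
{
  assert (kinds != NULL);

  if (groupnum == 0 || (kinds[i] & 0xff) != MODEL_MAP_TO_PSET)
    return false;

  for (size_t k = 1; k < groupnum; k++)
    if ((kinds[i + k] & 0xff) != MODEL_MAP_POINTER)
      return false;

  return true;
}
```

A group such as { TO_PSET, POINTER, POINTER } passes, while one containing any
other kind after the descriptor does not; the refcount side of the hunk
('n->refcount == REFCOUNT_INFINITY') is deliberately left out of this model.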


Regards
 Thomas



Re: [PATCH, v2] Fortran: ordering of hidden procedure arguments [PR107441]

2022-11-02 Thread Harald Anlauf via Fortran

On 02.11.22 at 18:20, Mikael Morin wrote:

> Unfortunately no, the coarray case works, but the other problem remains.
> The type problem is not visible in the definition of S, it is in the
> declaration of S's prototype in P.
>
> S is defined as:
>
> void s (character(kind=1)[1:_c] & restrict c, integer(kind=4) o,
>         logical(kind=1) _o, integer(kind=8) _c)
> {
> ...
> }
>
> but P has:
>
> void p ()
> {
>    static void s (character(kind=1)[1:] & restrict, integer(kind=4),
>                   integer(kind=8), logical(kind=1));
>    void (*) (character(kind=1)[1:] & restrict, integer(kind=4),
>              integer(kind=8), logical(kind=1)) pp;
>
>    pp = s;
> ...
> }


Right, now I see it too.  Simplified case:

program p
  call s ("abcd")
contains
  subroutine s(c, o)
    character(*) :: c
    integer, optional, value :: o
  end subroutine s
end

I do see what needs to be done in gfc_get_function_type, which seems
in fact very simple.  But I get really lost in create_function_arglist
when trying to get the typelist right.

One thing is I really don't understand how the (hidden_)typelist is
managed here.  How does that macro TREE_CHAIN work?  Can we somehow
chain two typelists the same way we chain arguments?

(Failing that, I tried to split the loop over the dummy arguments in
create_function_arglist into two passes, one for the optional+value
variant, and one for the rest.  It turned out to be a bad idea...)
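
For reference, the chaining in question can be modelled in a few lines of
plain C.  This is only a simplified sketch, not GCC's definitions:
'struct model_node', 'MODEL_CHAIN' and 'model_chainon' are made-up names, with
the 'chain' member playing the role that TREE_CHAIN (node) plays for a tree
node, and the append function written in the spirit of GCC's chainon ().
Whether the typelists in create_function_arglist can be handled this way is
not something the sketch settles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a list node: only the link is modelled.  */
struct model_node
{
  int value;                 /* payload, e.g. standing in for a type */
  struct model_node *chain;  /* next element, or NULL at the end */
};

#define MODEL_CHAIN(node) ((node)->chain)

/* Append list B to the end of list A and return the head of the result.
   Either list may be empty (NULL).  */
struct model_node *
model_chainon (struct model_node *a, struct model_node *b)
{
  if (a == NULL)
    return b;
  if (b == NULL)
    return a;

  /* Chaining a list onto itself would create a cycle.  */
  assert (a != b);

  struct model_node *last = a;
  while (MODEL_CHAIN (last) != NULL)
    last = MODEL_CHAIN (last);
  MODEL_CHAIN (last) = b;
  return a;
}
```

With one list for the visible entries and one for the hidden ones, a single
call at the end joins them, so the two can be built up independently while
walking the dummy arguments once.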

Harald




Re: adding attributes

2022-11-02 Thread Bernhard Reutner-Fischer via Fortran
On Mon, 31 Oct 2022 21:19:18 +
Dave Love via Fortran wrote:

> Bernhard Reutner-Fischer via Fortran writes:

> > Ideally the syntax would be the same as in C.  
> 
> Right.  I hoped it would be possible to lift machinery easily from C.

Lifting that won't work easily, no.
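
For comparison, the C spelling that a Fortran syntax would ideally mirror
looks roughly like the sketch below.  'model_sum_squares' and 'MODEL_CLONES'
are made-up names; the guard assumes GCC or a compatible compiler on x86-64
GNU/Linux, and on anything else the attribute is simply dropped so that a
single plain version is compiled.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: function multiversioning is only requested where it is
   expected to be available; elsewhere the macro expands to nothing.  */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__gnu_linux__)
# define MODEL_CLONES __attribute__ ((target_clones ("avx", "sse", "default")))
#else
# define MODEL_CLONES
#endif

/* Hypothetical kernel: the compiler may emit one clone per listed target
   plus a resolver that picks among them when the program is loaded.  */
MODEL_CLONES long
model_sum_squares (const int *v, size_t n)
{
  assert (v != NULL || n == 0);

  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += (long) v[i] * v[i];
  return sum;
}
```

Callers just call 'model_sum_squares'; which clone runs is decided without
any change at the call site.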

> There's no standard method for this sort of portable performance
> engineering as far as I can tell.  The best I could see was specifying a
> SIMD length statically in OpenMP.  I'm interested in things that
> potentially make the difference between, say, vectorization for AVX2 or
> full-width AVX512 versus SSE2 for profiled hot-spots.  I fully agree

I see.
So target_clones is one thing. What other attributes would be important?

> about measurement and not doing things blindly, and I prize
> maintainability.  However, target_clones is clearly better than the
> existing facility for explicit, target-independent unrolling, for instance.

Yes. Unroll is certainly only applicable in a few places, sure.
> 
> > In former times, you would compile your library multiple times
> > and provide a distinct, optimized version for each of the CPUs.
> > Maybe that would work for you equally well, without target_clones?  
> 
> "Former times" to me means, say, GEC 4000 v. IBM 370 and the aftermath
> of "all the world's a VAX", rather than different x86
> micro-architectures...  I do now work on both x86_64 and POWER.

In your job script you would use cpuid(1) to determine a properly tuned
binary for the parts of the cluster you run on. Or the installed
binaries are tuned for the host they are installed on and are located in
a uniform place per application.

> 
> Multiple compilation isn't a good solution.  I haven't followed the

It might not be good, but it's cheap and easy if you only have a small
set of different arches and subarches each. In a controlled
environment, with a batch scheduler. Won't work in the wild of course.

> current state of hardware capability support, but relevant systems don't
> have it on x86_64, at least.  That wouldn't help kernels of your
> simulation code that aren't abstracted into a library or set up for
> dynamic dispatch anyway.  I don't have a specific instance in mind, but
> consider OS packaging, which I do; that currently has to be built for
> base x86_64 (SSE2) for EPEL, at least, and so could miss a factor of
> several performance from vectorized.

For packaging for global use that won't work all that well indeed.

But since you cannot mix target_clones across arch-boundaries,
supporting those for a distro will probably be rather ugly anyway.
I think that's what Gentoo et al are for, or your privately rebuilt
Debian repo; provide a tuned world for everybody, individually ;) But as
you mentioned EPEL, I never said that :)

> 
> > HTH  
> 
> Thanks.  Definitely a more helpful response than when I asked about
> doing something previously!  (I don't know if I'll actually be able to
> work on it in the end, at least on work time.)

Heh, me neither. Luckily yesterday was a holiday, so what I ended up
with was the following, fya.
Consider:
$ grep -v "^\!\!" /scratch/src/gcc-13.mine/gcc/testsuite/gfortran.dg/attr_target_clones-1.f90;echo EOF
! { dg-do compile }
! { dg-options "-O1 -fdump-tree-optimized" }
!
! Test __attribute__ ((target_clones ("foo", "bar")))
!
module m
  implicit none
contains
  subroutine sub1()
!GCC$ ATTRIBUTES target_clones("avx", "sse","default") :: sub1
    print *, 4321
  end
end module m
! { dg-final { scan-tree-dump-times {void * __m_MOD_sub1.resolver ()} "optimized" 1 } }
! { dg-final { scan-tree-dump-times {void __m_MOD_sub1.avx ()} "optimized" 1 } }
! { dg-final { scan-tree-dump-times {void __m_MOD_sub1.sse ()} "optimized" 1 } }
!!! { dg-final { scan-tree-dump-times {XXX something sub1.default ()} "optimized" 1 } }
! { dg-final { scan-tree-dump-not {void sub1 ()} "optimized" } }
EOF
Which gives
$ ./gfortran -B. -o /tmp/out.o -c /scratch/src/gcc-13.mine/gcc/testsuite/gfortran.dg/attr_target_clones-1.f90 -O2 -fdump-tree-original -fdump-tree-optimized
/tmp/ccxpGd9Y.s: Assembler messages:
/tmp/ccxpGd9Y.s:118: Error: symbol `__m_MOD_sub1' is already defined

That's because that ends up as
$ nl -ba /tmp/out.s | grep __m_MOD_sub1
    12  .type   __m_MOD_sub1, @function
    13  __m_MOD_sub1:
    35  .size   __m_MOD_sub1, .-__m_MOD_sub1
    36  .type   __m_MOD_sub1.avx, @function
    37  __m_MOD_sub1.avx:
    59  .size   __m_MOD_sub1.avx, .-__m_MOD_sub1.avx
    60  .type   __m_MOD_sub1.sse, @function
    61  __m_MOD_sub1.sse:
    83  .size   __m_MOD_sub1.sse, .-__m_MOD_sub1.sse
    84  .section        .text.__m_MOD_sub1.resolver,"axG",@progbits,__m_MOD_sub1.resolver,comdat
    85  .weak   __m_MOD_sub1.resolver
    86  .type   __m_MOD_sub1.resolver, @function
    87  __m_MOD_sub1.resolver:
    95  movl    $__m_MOD_sub1.avx, %eax
   104  movl    $__m_MOD_sub1, %eax
   105  movl    $__m_MOD_su

Finalization

2022-11-02 Thread Jerry D via Fortran

Hi Paul,

Long time no chat.  I hope you and yours are well.

I was planning to retire early next year, but with the economy going 
south on us I am going to hold off a bit.  What a crazy world we are in!


I thought I would drop you a note when I noticed a gfortran finalization 
bug 107489.  Then Steve Kargl mentioned the previous patch you had done 
a lot of work on.  I don't remember if that ever went into trunk or was 
pending some interpretation.


I tried to apply the patch I found and of course it is broken here and 
there.  I don't know if you have been keeping it alive or not relative 
to latest trunk.  So, I thought I would ask privately and at the same 
time see how you are doing?


Cheers always Paul,

Jerry


Re: Finalization

2022-11-02 Thread Damian Rouson
Thanks for asking this, Jerry!

Sourcery Institute has a (small) amount of funding that can be offered for
this work in case that helps.

Damian

On Wed, Nov 2, 2022 at 6:32 PM Jerry D via Fortran wrote:

> Hi Paul,
>
> Long time no chat.  I hope you and yours are well.
>
> I was planning to retire early next year, but with the economy going
> south on us I am going to hold off a bit.  What a crazy world we are in!
>
> I thought I would drop you a note when I noticed a gfortran finalization
> bug 107489.  Then Steve Kargl mentioned the previous patch you had done
> a lot of work on.  I don't remember if that ever went into trunk or was
> pending some interpretation.
>
> I tried to apply the patch I found and of course it is broken here and
> there.  I don't know if you have been keeping it alive or not relative
> to latest trunk.  So, I thought I would ask privately and at the same
> time see how you are doing?
>
> Cheers always Paul,
>
> Jerry
>