Re: [PATCH v6] libgfortran: Replace mutex with rwlock
On Fri, Aug 18, 2023 at 11:18:19AM +0800, Zhu, Lipeng wrote: > From: Lipeng Zhu > > This patch try to introduce the rwlock and split the read/write to > unit_root tree and unit_cache with rwlock instead of the mutex to > increase CPU efficiency. In the get_gfc_unit function, the percentage > to step into the insert_unit function is around 30%, in most instances, > we can get the unit in the phase of reading the unit_cache or unit_root > tree. So split the read/write phase by rwlock would be an approach to > make it more parallel. > > BTW, the IPC metrics can gain around 9x in our test > server with 220 cores. The benchmark we used is > https://github.com/rwesson/NEAT > > libgcc/ChangeLog: > > * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro > (__gthrw): New function > (__gthread_rwlock_rdlock): New function > (__gthread_rwlock_tryrdlock): New function > (__gthread_rwlock_wrlock): New function > (__gthread_rwlock_trywrlock): New function > (__gthread_rwlock_unlock): New function > > libgfortran/ChangeLog: > > * io/async.c (DEBUG_LINE): New > * io/async.h (RWLOCK_DEBUG_ADD): New macro > (CHECK_RDLOCK): New macro > (CHECK_WRLOCK): New macro > (TAIL_RWLOCK_DEBUG_QUEUE): New macro > (IN_RWLOCK_DEBUG_QUEUE): New macro > (RDLOCK): New macro > (WRLOCK): New macro > (RWUNLOCK): New macro > (RD_TO_WRLOCK): New macro > (INTERN_RDLOCK): New macro > (INTERN_WRLOCK): New macro > (INTERN_RWUNLOCK): New macro > * io/io.h (internal_proto): Define unit_rwlock > * io/transfer.c (st_read_done_worker): Relace unit_lock with unit_rwlock > (st_write_done_worker): Relace unit_lock with unit_rwlock > * io/unit.c (get_gfc_unit): Relace unit_lock with unit_rwlock > (if): Relace unit_lock with unit_rwlock > (close_unit_1): Relace unit_lock with unit_rwlock > (close_units): Relace unit_lock with unit_rwlock > (newunit_alloc): Relace unit_lock with unit_rwlock > * io/unix.c (flush_all_units): Relace unit_lock with unit_rwlock The changeLog entries are all incorrect. 1) they should be indented by a tab, not 4 spaces, and should end with a dot 2) when several consecutive descriptions have the same text, especially when it is long, it should use Likewise. for the 2nd and following 3) (internal_proto) is certainly not what you've changed, from what I can see in io.h you've done: * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in a comment. (unit_lock): Remove including associated internal_proto. (unit_rwlock): New declarations including associated internal_proto. (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock instead of __gthread_mutex_lock and __gthread_mutex_unlock on unit_lock. (if) is certainly not what you've changed either, always find what function or macro the change was in, or if you remove something, state it, if you add something, state it. 4) all the Replace unit_lock with unit_rwlock. descriptions only partially match reality, you've also changed the operations on those variables. > --- a/libgfortran/io/async.h > +++ b/libgfortran/io/async.h > @@ -207,9 +207,132 @@ > INTERN_LOCK (&debug_queue_lock); \ > MUTEX_DEBUG_ADD (mutex); \ > INTERN_UNLOCK (&debug_queue_lock); > \ > -DEBUG_PRINTF ("%s" DEBUG_RED "ACQ:" DEBUG_NORM " %-30s %78p\n", > aio_prefix, #mutex, mutex); \ > +DEBUG_PRINTF ("%s" DEBUG_RED "ACQ:" DEBUG_NORM " %-30s %78p\n", > aio_prefix, #mutex, \ > + mutex); \ Why are you changing this at all? > +#define RD_TO_WRLOCK(rwlock) \ > + RWUNLOCK (rwlock);\ At least a space before \ (or better tab > +#define RD_TO_WRLOCK(rwlock) \ > + RWUNLOCK (rwlock);\ Likewise. > + WRLOCK (rwlock); > +#endif > +#endif > + > +#ifndef __GTHREAD_RWLOCK_INIT > +#define RDLOCK(rwlock) LOCK (rwlock) > +#define WRLOCK(rwlock) LOCK (rwlock) > +#define RWUNLOCK(rwlock) UNLOCK (rwlock) > +#define RD_TO_WRLOCK(rwlock) {} do {} while (0) instead of {} ? > #endif > > #define INTERN_LOCK(mutex) T_ERROR (__gthread_mutex_lock, mutex); > > #define INTERN_UNLOCK(mutex) T_ERROR (__gthread_mutex_unlock, mutex); > > +#define INTERN_RDLOCK(rwlock) T_ERROR (__gthread_rwlock_rdlock, rwlock); > +#define INTERN_WRLOCK(rwlock) T_ERROR (__gthread_rwlock_wrlock, rwlock); > +#define INTERN_RWUNLOCK(rwlock) T_ERROR (__gthread_rwlock_unlock, rwlock); Admittedly preexisting issue, but I wonder why the ; at the end, all the uses of RDLOCK etc. I've seen were followed by ; > --- a/libgfortran/io/unit.c > +++ b/libgfortran/io/unit.c > @@ -33,34 +33,36 @@ see the files COPYING3 and COPYING.RUNTIME respectively. > If not, see > > > /* IO locking rules: > - UNIT_LOCK is a master lock, protecting UNIT_ROOT tree and UNIT_CACHE. > + UNIT_RWLOCK is a master lock, protect
[PATCH] Fortran: allow NULL() for POINTER, OPTIONAL, CONTIGUOUS dummy [PR111503]
Dear all, here's another fix for the CONTIGUOUS attribute: NULL() should derive its characteristics from its MOLD argument; otherwise it is "determined by the entity with which the reference is associated". (F2018:16.9.144). The testcase is cross-checked with Intel. NAG rejects cases where MOLD is a pointer. I think it is wrong here. Regtested on x86_64-pc-linux-gnu. OK for mainline? Thanks, Harald From c73b248ec16388ed1ce109fce8a468a87e367085 Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Fri, 8 Dec 2023 11:11:08 +0100 Subject: [PATCH] Fortran: allow NULL() for POINTER, OPTIONAL, CONTIGUOUS dummy [PR111503] gcc/fortran/ChangeLog: PR fortran/111503 * expr.cc (gfc_is_simply_contiguous): Determine characteristics of NULL() from MOLD argument if present, otherwise treat as present. * primary.cc (gfc_variable_attr): Derive attributes of NULL(MOLD) from MOLD. gcc/testsuite/ChangeLog: PR fortran/111503 * gfortran.dg/contiguous_14.f90: New test. --- gcc/fortran/expr.cc | 14 gcc/fortran/primary.cc | 4 ++- gcc/testsuite/gfortran.dg/contiguous_14.f90 | 39 + 3 files changed, 56 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/contiguous_14.f90 diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc index c668baeef8c..709f3c3cbef 100644 --- a/gcc/fortran/expr.cc +++ b/gcc/fortran/expr.cc @@ -5958,6 +5958,20 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool strict, bool permit_element) if (expr->expr_type == EXPR_ARRAY) return true; + if (expr->expr_type == EXPR_NULL) +{ + /* F2018:16.9.144 NULL ([MOLD]): + "If MOLD is present, the characteristics are the same as MOLD." + "If MOLD is absent, the characteristics of the result are + determined by the entity with which the reference is associated." + F2018:15.3.2.2 characteristics attributes include CONTIGUOUS. */ + if (expr->ts.type == BT_UNKNOWN) + return true; + else + return (gfc_variable_attr (expr, NULL).contiguous + || gfc_variable_attr (expr, NULL).allocatable); +} + if (expr->expr_type == EXPR_FUNCTION) { if (expr->value.function.isym) diff --git a/gcc/fortran/primary.cc b/gcc/fortran/primary.cc index 7278932b634..f8a1c09d190 100644 --- a/gcc/fortran/primary.cc +++ b/gcc/fortran/primary.cc @@ -2627,7 +2627,9 @@ gfc_variable_attr (gfc_expr *expr, gfc_typespec *ts) gfc_component *comp; bool has_inquiry_part; - if (expr->expr_type != EXPR_VARIABLE && expr->expr_type != EXPR_FUNCTION) + if (expr->expr_type != EXPR_VARIABLE + && expr->expr_type != EXPR_FUNCTION + && !(expr->expr_type == EXPR_NULL && expr->ts.type != BT_UNKNOWN)) gfc_internal_error ("gfc_variable_attr(): Expression isn't a variable"); sym = expr->symtree->n.sym; diff --git a/gcc/testsuite/gfortran.dg/contiguous_14.f90 b/gcc/testsuite/gfortran.dg/contiguous_14.f90 new file mode 100644 index 000..21e42311e9c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/contiguous_14.f90 @@ -0,0 +1,39 @@ +! { dg-do compile } +! PR fortran/111503 - passing NULL() to POINTER, OPTIONAL, CONTIGUOUS dummy + +program test + implicit none + integer, pointer, contiguous :: p(:) => null() + integer, allocatable, target :: a(:) + type t + integer, pointer, contiguous :: p(:) => null() + integer, allocatable :: a(:) + end type t + type(t), target :: z + class(t), allocatable, target :: c + print *, is_contiguous (p) + allocate (t :: c) + call one (p) + call one () + call one (null ()) + call one (null (p)) + call one (a) + call one (null (a)) + call one (z% p) + call one (z% a) + call one (null (z% p)) + call one (null (z% a)) + call one (c% p) + call one (c% a) + call one (null (c% p)) + call one (null (c% a)) +contains + subroutine one (x) +integer, pointer, optional, contiguous, intent(in) :: x(:) +print *, present (x) +if (present (x)) then + print *, "->", associated (x) + if (associated (x)) stop 99 +end if + end subroutine one +end -- 2.35.3
Re: [PATCH] Fortran: allow NULL() for POINTER, OPTIONAL, CONTIGUOUS dummy [PR111503]
Hi Harald, > here's another fix for the CONTIGUOUS attribute: NULL() should > derive its characteristics from its MOLD argument; otherwise it is > "determined by the entity with which the reference is associated". > (F2018:16.9.144). Looking good to me, but leave 48 hours for someone else to object if they want. Best, FX
Re: [patch] OpenMP/Fortran: Implement omp allocators/allocate for ptr/allocatables
On Wed, Nov 08, 2023 at 05:58:10PM +0100, Tobias Burnus wrote: > --- a/gcc/builtins.cc > +++ b/gcc/builtins.cc > @@ -11739,6 +11739,7 @@ builtin_fnspec (tree callee) > return ".cO "; >/* Realloc serves both as allocation point and deallocation point. */ >case BUILT_IN_REALLOC: > + case BUILT_IN_GOMP_REALLOC: > return ".Cw "; I must say I've never been sure if one needs to specify those ". " for integral arguments for which nothing is known; if they would need to be, then also all the BUILT_IN_GOMP_* other cases would be wrong, but not just those, also BUILT_IN_*_CHK (which have extra size_t argument) or here BUILT_IN_REALLOC itself. So, let's hope it is ok as is. Otherwise, the middle-end changes look just fine to me, and for Fortran FE I'm afraid you know it much more than I do. Jakub
Re: [patch] OpenMP: Add uses_allocators support
On Mon, Nov 20, 2023 at 11:42:02AM +0100, Tobias Burnus wrote: > 2023-11-19 Tobias Burnus > Chung-Lin Tang > > gcc/ChangeLog: > > * builtin-types.def (BT_FN_VOID_PTRMODE): > (BT_FN_PTRMODE_PTRMODE_INT_PTR): Add. > * gimplify.cc (gimplify_bind_expr): Diagnose missing > uses_allocators clause. > (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses, > gimplify_omp_workshare): Handle uses_allocators. > * omp-builtins.def (BUILT_IN_OMP_INIT_ALLOCATOR, > BUILT_IN_OMP_DESTROY_ALLOCATOR): Add. > * omp-low.cc (scan_sharing_clauses): Missing description. > +static tree > +c_parser_omp_clause_uses_allocators (c_parser *parser, tree list) > +{ > + location_t clause_loc = c_parser_peek_token (parser)->location; > + tree t = NULL_TREE, nl = list; > + matching_parens parens; > + if (!parens.require_open (parser)) > +return list; > + > + tree memspace_expr = NULL_TREE; > + tree traits_var = NULL_TREE; > + > + struct item_tok > + { > +location_t loc; > +tree id; > +item_tok (void) : loc (UNKNOWN_LOCATION), id (NULL_TREE) {} > + }; > + struct item { item_tok name, arg; }; > + auto_vec *modifiers = NULL, *allocators = NULL; > + auto_vec *cur_list = new auto_vec (4); This is certainly the first time I've seen pointers to auto_vec, normally one uses just vec in such cases, auto_vec is used typically on automatic variables to make sure the destruction is done. But I think all the first parse it as a token soup without checking anything and only in the second round actually check it is something we've never done before in exactly the same situations. The usual way would be to quickly peek at tokens to see if there is : ahead and decide based on that. See e.g. c_parser_omp_clause_allocate clause. That has_modifiers check could be basically copied over with the names of modifiers changed, or could be done in a loop, or could be moved into a helper function which could be used by c_parser_omp_clause_allocate and this function and perhaps others, pass it the list of modifiers and have it return whether there are modifiers or not. c_parser_omp_clause_linear does this too (though, that has one of the modifiers without arguments). I'm afraid if parsing of every clause does the parsing so significantly differently it will be a maintainance nightmare. > @@ -23648,7 +23861,8 @@ c_parser_omp_target_exit_data (location_t loc, > c_parser *parser, > | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION) \ > | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_THREAD_LIMIT) \ > | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR)\ > - | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR)) > + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR)\ Space before \ please (and also in the line above. > + if (strcmp (IDENTIFIER_POINTER (DECL_NAME (t)), > + "omp_null_allocator") == 0) > + { > + error_at (OMP_CLAUSE_LOCATION (c), > + "% cannot be used in " > + "% clause"); > + break; > + } > + > + if (OMP_CLAUSE_USES_ALLOCATORS_MEMSPACE (c) > + || OMP_CLAUSE_USES_ALLOCATORS_TRAITS (c)) > + { > + error_at (OMP_CLAUSE_LOCATION (c), > + "modifiers cannot be used with pre-defined " > + "allocators"); > + break; > + } > + } > + t = OMP_CLAUSE_USES_ALLOCATORS_MEMSPACE (c); > + if (t != NULL_TREE > + && (TREE_CODE (t) != CONST_DECL > + || TREE_CODE (TREE_TYPE (t)) != ENUMERAL_TYPE > + || strcmp (IDENTIFIER_POINTER (TYPE_IDENTIFIER (TREE_TYPE > (t))), > + "omp_memspace_handle_t") != 0)) > + { > + error_at (OMP_CLAUSE_LOCATION (c), "memspace modifier must be " Maybe % ? > + "constant enum of % type"); > + remove = true; > + break; > + } > + t = OMP_CLAUSE_USES_ALLOCATORS_TRAITS (c); > + if (t != NULL_TREE) > + { > + bool type_err = false; > + > + if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE > + || DECL_SIZE (t) == NULL_TREE) > + type_err = true; > + else > + { > + tree elem_t = TREE_TYPE (TREE_TYPE (t)); > + if (TREE_CODE (elem_t) != RECORD_TYPE > + || strcmp (IDENTIFIER_POINTER (TYPE_IDENTIFIER (elem_t)), > + "omp_alloctrait_t") != 0 > + || !TYPE_READONLY (elem_t)) > + type_err = true; > + } > + if (type_err) > + { > + if (TREE_CODE (t) != ERROR_MARK) t != error_mark_node > + error_at (OMP_CLAUSE_LOCATION (c), "traits array %qE must " > +
Re: [Patch] OpenMP: Support acquires/release in 'omp require atomic_default_mem_order'
On Tue, Nov 28, 2023 at 12:28:05PM +0100, Tobias Burnus wrote: > I stumbled over this omission when looking at Sandra's patch. It turned out > that this is > a new OpenMP 5.2 feature - probably added to simplify/unify the syntax. I > guess the reason > that release/acquire wasn't added before is that it cannot be universally be > used - read/write > do only accept one of them. I thought when this was discussed that it was meant to behave right (choose some more appropriate memory model if one was not allowed), but reading 5.2 I think that is not what ended up in the spec, because [213:11-13] says that atomic_default_mem_order is as if the argument appeared on any atomic directive without explicit mem-order clause and atomic directive has the [314:9-10] restrictions. I'd bring this to omp-lang whether it was really meant that #pragma omp requires atomic_default_mem_order (release) int foo (int *p) { int t; #pragma omp atomic read t = *p; return t; } and #pragma omp requires atomic_default_mem_order (acquire) void bar (int *p) { #pragma omp atomic write *p = 0; } are meant to be invalid. Another comment, atomic_default_mem_order arguments aren't handled just in the requires parsing, but also in context selectors. So, the additions would need to be reflected in omp_check_context_selector and omp_context_selector_matches as well. Jakub