On 19 April 2023 09:06:28 CEST, Lipeng Zhu via Fortran <fort...@gcc.gnu.org> wrote: >This patch try to introduce the rwlock and split the read/write to >unit_root tree and unit_cache with rwlock instead of the mutex to >increase CPU efficiency. In the get_gfc_unit function, the percentage >to step into the insert_unit function is around 30%, in most instances, >we can get the unit in the phase of reading the unit_cache or unit_root >tree. So split the read/write phase by rwlock would be an approach to >make it more parallel. > >BTW, the IPC metrics can gain around 9x in our test >server with 220 cores. The benchmark we used is >https://github.com/rwesson/NEAT >
>+#define RD_TO_WRLOCK(rwlock) \ >+ RWUNLOCK (rwlock);\ >+ WRLOCK (rwlock); >+#endif >+ >diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c >index 82664dc5f98..4312c5f36de 100644 >--- a/libgfortran/io/unit.c >+++ b/libgfortran/io/unit.c >@@ -329,7 +335,7 @@ get_gfc_unit (int n, int do_create) > int c, created = 0; > > NOTE ("Unit n=%d, do_create = %d", n, do_create); >- LOCK (&unit_lock); >+ RDLOCK (&unit_rwlock); > > retry: > for (c = 0; c < CACHE_SIZE; c++) >@@ -350,6 +356,7 @@ retry: > if (c == 0) > break; > } >+ RD_TO_WRLOCK (&unit_rwlock); So I'm trying to convince myself why it's safe to unlock and only then take the write lock. Can you please elaborate/confirm why that's ok? I wouldn't mind a comment like We can release the unit and cache read lock now. We might have to allocate a (locked) unit, below in do_create. Either way, we will manipulate the cache and/or the unit list so we have to take a write lock now. We don't take the write bit in *addition* to the read lock because.. (that needlessly complicates releasing the respective locks / it triggers too much contention when we.. / ...?) thanks, > > if (p == NULL && do_create) > { >@@ -368,8 +375,8 @@ retry: > if (created) > { > /* Newly created units have their lock held already >- from insert_unit. Just unlock UNIT_LOCK and return. */ >- UNLOCK (&unit_lock); >+ from insert_unit. Just unlock UNIT_RWLOCK and return. */ >+ RWUNLOCK (&unit_rwlock); > return p; > } > >@@ -380,7 +387,7 @@ found: > if (! TRYLOCK (&p->lock)) > { > /* assert (p->closed == 0); */ >- UNLOCK (&unit_lock); >+ RWUNLOCK (&unit_rwlock); > return p; > } > >@@ -388,14 +395,14 @@ found: > } > > >- UNLOCK (&unit_lock); >+ RWUNLOCK (&unit_rwlock); > > if (p != NULL && (p->child_dtio == 0)) > { > LOCK (&p->lock); > if (p->closed) > { >- LOCK (&unit_lock); >+ WRLOCK (&unit_rwlock); > UNLOCK (&p->lock); > if (predec_waiting_locked (p) == 0) > destroy_unit_mutex (p); >@@ -593,8 +600,8 @@ init_units (void) > #endif > #endif > >-#ifndef __GTHREAD_MUTEX_INIT >- __GTHREAD_MUTEX_INIT_FUNCTION (&unit_lock); >+#if (!defined(__GTHREAD_RWLOCK_INIT) && !defined(__GTHREAD_MUTEX_INIT)) >+ __GTHREAD_MUTEX_INIT_FUNCTION (&unit_rwlock); > #endif > > if (sizeof (max_offset) == 8)