Hello Berny,

On Monday, January 2, 2017 8:02:03 PM CET Bernhard Voelker wrote:
> On 01/02/2017 05:37 PM, Pavel Raiskup wrote:
> > On Monday, January 2, 2017 4:50:28 PM CET Bruno Haible wrote:
> >> Especially since the problem occurs only on one architecture.
> > 
> > I've been able to reproduce this on i686 in the meantime too, sorry --
> > I just reported what I observed :(.  See [1].
> 
> ... or it is related to the KOJI environment?

Maybe.  I was able to reproduce this on an x86_64 VM, running the build
(and tests) in a mock i386 _chroot_ (the koji system "cross-compiles"
packages in an i386 chroot).

So finally I was able to attach strace and gdb to the process (on an
8-core machine), and gdb confirmed that:

  - the readers wait for the main process (busy loop)
  - the main process waits for all writers to finish (thread join)
  - the writers wait indefinitely for the rwlock to be released by the
    readers

> I've seen some of the gnulib tests in the coreutils-testsuite failing on
> several non-x86 archs on the openSUSE build service in the past
> (especially on newer like aarch64).  But at least in the past year, the
> tests on all of i586, x86_64, ppc, ppc64, ppc64le, aarch64 and armv7l
> have been quite stable.

That seems to be a different issue.

I am able to reproduce this only very non-deterministically, under weird
conditions.  But to make it a bit more deterministic -- please try the
attached patch, which:

  - prolongs the critical section in the reader threads
  - to be a bit fairer, decreases the number of concurrent threads to
    3 readers (and 3 writers), so the thread queue is not too long

After at most a few iterations, the output shows only:

  ...
  Checker 0x7fc9586b6700 after  check unlock
  Checker 0x7fc9586b6700 before check rdlock
  Checker 0x7fc957eb5700 after  check unlock
  Checker 0x7fc957eb5700 before check rdlock
  ....

At least on my box ... is it the same for you?  If yes, is there some
mistake in the patch?  Because otherwise that would just prove that we are
testing behavior which is not guaranteed to happen; IOW, we can't
guarantee that the critical sections _don't_ always take the same (long
enough) time period, so that there is always one reader holding the lock.

I am worried about the explicit yield, which doesn't help because
(probably?) PTHREAD_RWLOCK_PREFER_READER_NP is the default?  Dunno.

Pavel
diff --git a/tests/test-lock.c b/tests/test-lock.c
index 095511e..336daeb 100644
--- a/tests/test-lock.c
+++ b/tests/test-lock.c
@@ -17,6 +17,8 @@
 /* Written by Bruno Haible <br...@clisp.org>, 2005.  */
 
 #include <config.h>
+#include <time.h>
+#include <unistd.h>
 
 #if USE_POSIX_THREADS || USE_SOLARIS_THREADS || USE_PTH_THREADS || USE_WINDOWS_THREADS
 
@@ -41,10 +42,10 @@
 /* Which tests to perform.
    Uncomment some of these, to verify that all tests crash if no locking
    is enabled.  */
-#define DO_TEST_LOCK 1
+#define DO_TEST_LOCK 0
 #define DO_TEST_RWLOCK 1
-#define DO_TEST_RECURSIVE_LOCK 1
-#define DO_TEST_ONCE 1
+#define DO_TEST_RECURSIVE_LOCK 0
+#define DO_TEST_ONCE 0
 
 /* Whether to help the scheduler through explicit yield().
    Uncomment this to see if the operating system has a fair scheduler.  */
@@ -58,10 +59,10 @@
 #define USE_VOLATILE 0
 
 /* Whether to print debugging messages.  */
-#define ENABLE_DEBUGGING 0
+#define ENABLE_DEBUGGING 1
 
 /* Number of simultaneous threads.  */
-#define THREAD_COUNT 10
+#define THREAD_COUNT 3
 
 /* Number of operations performed in each thread.
    This is quite high, because with a smaller count, say 5000, we often get
@@ -312,11 +313,20 @@ static struct atomic_int rwlock_checker_done;
 static void *
 rwlock_checker_thread (void *arg)
 {
+  int tid = *((int *)arg);
   while (get_atomic_int_value (&rwlock_checker_done) == 0)
     {
       dbgprintf ("Checker %p before check rdlock\n", gl_thread_self_pointer ());
       gl_rwlock_rdlock (my_rwlock);
       check_accounts ();
+
+      /* Wait about (THREAD_COUNT - 1) seconds in the critical section, so
+       * (if writers are not preferred) at least one reader always holds
+       * the read lock and the writers starve.  */
+      while (time(NULL) % THREAD_COUNT != tid)
+        usleep(100000);
+      usleep(800000);
+
       gl_rwlock_unlock (my_rwlock);
       dbgprintf ("Checker %p after  check unlock\n", gl_thread_self_pointer ());
 
@@ -342,7 +352,13 @@
 
   /* Spawn the threads.  */
+  /* Use stable storage for the checker thread ids; passing &i directly
+     would race with the loop increment below.  */
+  static int tids[THREAD_COUNT];
   for (i = 0; i < THREAD_COUNT; i++)
-    checkerthreads[i] = gl_thread_create (rwlock_checker_thread, NULL);
+    {
+      tids[i] = i;
+      checkerthreads[i] = gl_thread_create (rwlock_checker_thread, &tids[i]);
+    }
   for (i = 0; i < THREAD_COUNT; i++)
     threads[i] = gl_thread_create (rwlock_mutator_thread, NULL);
 
