Hi, Thanks for the additional details. I was able to reproduce the bug, and I believe I now see the problem.
'atomic-box-compare-and-swap!' is implemented using 'atomic_compare_exchange_weak' (if available), but neglects to handle the case where 'atomic_compare_exchange_weak' may spuriously fail. In that case, the box is left unchanged, although the observed value was equal to the expected value. What's happening here is that the 'atomic-box-compare-and-swap!' in 'sleep-loop' fails spuriously, leaving the box in state #:accepted although it returns #:accepted as its result. When the main loop discovers this, it changes the state to #:need-to-sleep, although the thread has already ended. To confirm this hypothesis, I added a print statement to the main loop showing the state of the box that it observed during the protocol exchange, and indeed it sees #:accepted the first time it checks, and #:need-to-sleep in all later iterations. I've attached a proposed patch that I hope will fix this problem. If you'd be willing to test it, I'd be grateful, but otherwise I'll try to test it in the next week or so. My access to armv7l boxes is somewhat limited. Thanks for this report. Mark
>From 958d37686a9ac65f48cb2ca32a341cf182c24b5a Mon Sep 17 00:00:00 2001 From: Mark H Weaver <m...@netris.org> Date: Thu, 27 Sep 2018 01:00:11 -0400 Subject: [PATCH] UNTESTED: Fix 'atomic-box-compare-and-swap!'. Fixes <https://bugs.gnu.org/32786>. 'scm_atomic_compare_and_swap_scm' uses 'atomic_compare_exchange_weak' if it's available on the host platform. 'atomic_compare_exchange_weak' may fail spuriously, leaving the atomic object unchanged even when it contained the expected value. 'atomic-box-compare-and-swap!' uses 'scm_atomic_compare_and_swap_scm' without checking its return value, and therefore may return the expected value while leaving the box unchanged. This contradicts the documentation, which explicitly states that "you can know if the swap worked by checking if the return value is 'eq?' to EXPECTED.". * libguile/atomic.c (scm_atomic_box_compare_and_swap_x): If 'scm_atomic_compare_and_swap_scm' returns false and the observed value is equal to 'expected', then try again. * libguile/vm-engine.c (atomic-box-compare-and-swap!): Ditto. --- libguile/atomic.c | 13 +++++++++---- libguile/vm-engine.c | 13 ++++++++----- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/libguile/atomic.c b/libguile/atomic.c index 950874030..504643912 100644 --- a/libguile/atomic.c +++ b/libguile/atomic.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2016 Free Software Foundation, Inc. +/* Copyright (C) 2016, 2018 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -95,10 +95,15 @@ SCM_DEFINE (scm_atomic_box_compare_and_swap_x, "if the return value is @code{eq?} to @var{expected}.") #define FUNC_NAME s_scm_atomic_box_compare_and_swap_x { + SCM result = expected; + SCM_VALIDATE_ATOMIC_BOX (1, box); - scm_atomic_compare_and_swap_scm (scm_atomic_box_loc (box), - &expected, desired); - return expected; + while (!scm_atomic_compare_and_swap_scm (scm_atomic_box_loc (box), + &result, desired) + && scm_is_eq (result, expected)) + { + } + return result; } #undef FUNC_NAME diff --git a/libguile/vm-engine.c b/libguile/vm-engine.c index 19ff3e498..9650e33eb 100644 --- a/libguile/vm-engine.c +++ b/libguile/vm-engine.c @@ -3868,16 +3868,19 @@ VM_NAME (scm_i_thread *thread, struct scm_vm *vp, { scm_t_uint16 dst, box; scm_t_uint32 expected, desired; - SCM scm_box, scm_expected; + SCM scm_box, scm_expected, scm_result; UNPACK_12_12 (op, dst, box); UNPACK_24 (ip[1], expected); UNPACK_24 (ip[2], desired); scm_box = SP_REF (box); VM_VALIDATE_ATOMIC_BOX (scm_box, "atomic-box-compare-and-swap!"); - scm_expected = SP_REF (expected); - scm_atomic_compare_and_swap_scm (scm_atomic_box_loc (scm_box), - &scm_expected, SP_REF (desired)); - SP_SET (dst, scm_expected); + scm_result = scm_expected = SP_REF (expected); + while (!scm_atomic_compare_and_swap_scm (scm_atomic_box_loc (scm_box), + &scm_result, SP_REF (desired)) + && scm_is_eq (scm_result, scm_expected)) + { + } + SP_SET (dst, scm_result); NEXT (3); } -- 2.19.0