The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=8694fd333556addb97acfff1feca6a1e389201ce

commit 8694fd333556addb97acfff1feca6a1e389201ce
Author:     Mark Johnston <ma...@freebsd.org>
AuthorDate: 2022-09-24 13:18:04 +0000
Commit:     Mark Johnston <ma...@freebsd.org>
CommitDate: 2022-09-24 13:18:04 +0000

    smr: Fix synchronization in smr_enter()
    
    smr_enter() must publish its observed read sequence number before
    issuing any subsequent memory operations.  The ordering provided by
    atomic_add_acq_int() is insufficient on some platforms, at least on
    arm64, because it permits reordering of subsequent loads with the store
    to c_seq.
    
    Thus, use atomic_thread_fence_seq_cst() to issue a store-load barrier
    after publishing the read sequence number.
    
    On x86, take advantage of the fact that memory operations are not
    reordered with locked instructions to improve code density: we can store
    the observed read sequence and provide a store-load barrier with a
    single operation.
    
    Based on a patch from Pierre Habouzit <pie...@habouzit.net>.
    
    PR:             265974
    Reviewed by:    alc
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D36370
---
 sys/sys/smr.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/sys/sys/smr.h b/sys/sys/smr.h
index c110be9a66c2..1319e2bf465b 100644
--- a/sys/sys/smr.h
+++ b/sys/sys/smr.h
@@ -122,8 +122,12 @@ smr_enter(smr_t smr)
         * Frees that are newer than this stored value will be
         * deferred until we call smr_exit().
         *
-        * An acquire barrier is used to synchronize with smr_exit()
-        * and smr_poll().
+        * Subsequent loads must not be re-ordered with the store.  On
+        * x86 platforms, any locked instruction will provide this
+        * guarantee, so as an optimization we use a single operation to
+        * both store the cached write sequence number and provide the
+        * requisite barrier, taking advantage of the fact that
+        * SMR_SEQ_INVALID is zero.
         *
         * It is possible that a long delay between loading the wr_seq
         * and storing the c_seq could create a situation where the
@@ -132,8 +136,12 @@ smr_enter(smr_t smr)
         * the load.  See smr_poll() for details on how this condition
         * is detected and handled there.
         */
-       /* This is an add because we do not have atomic_store_acq_int */
+#if defined(__amd64__) || defined(__i386__)
        atomic_add_acq_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+#else
+       atomic_store_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+       atomic_thread_fence_seq_cst();
+#endif
 }
 
 /*
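
For readers outside the kernel tree, the two code paths in the diff can be approximated in userland with C11 atomics. This is only a sketch under stated assumptions, not the code in sys/sys/smr.h: struct sketch_smr, sketch_enter(), sketch_exit() and SKETCH_SEQ_INVALID are hypothetical names, the acquire load of wr_seq stands in for smr_shared_current(), and the release store in sketch_exit() is an assumed counterpart to smr_exit(). The x86 branch relies on the hardware property named in the commit message (a locked read-modify-write is not reordered with later loads), not on a guarantee of the abstract C11 model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for SMR_SEQ_INVALID, which the commit notes is zero. */
#define SKETCH_SEQ_INVALID 0u

/* Hypothetical, simplified analogue of the shared and per-CPU SMR state. */
struct sketch_smr {
	_Atomic uint32_t wr_seq;	/* shared write sequence number */
	_Atomic uint32_t c_seq;		/* this reader's published sequence */
};

static inline void
sketch_enter(struct sketch_smr *s)
{
	uint32_t seq;

	/* The add-based path below only works if c_seq is zero on entry. */
	assert(atomic_load_explicit(&s->c_seq, memory_order_relaxed) ==
	    SKETCH_SEQ_INVALID);

	seq = atomic_load_explicit(&s->wr_seq, memory_order_acquire);
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * 0 + seq == seq, so one locked read-modify-write both publishes
	 * the sequence and acts as the store-load barrier on x86.
	 */
	atomic_fetch_add_explicit(&s->c_seq, seq, memory_order_seq_cst);
#else
	/*
	 * Elsewhere (for example arm64), publish with a plain store and
	 * follow it with a full fence so that later loads cannot be
	 * reordered ahead of the store.
	 */
	atomic_store_explicit(&s->c_seq, seq, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
#endif
}

static inline void
sketch_exit(struct sketch_smr *s)
{
	/* Assumed counterpart: clear the published sequence with release. */
	atomic_store_explicit(&s->c_seq, SKETCH_SEQ_INVALID,
	    memory_order_release);
}
```

A caller would bracket its read-side accesses with sketch_enter() and sketch_exit(). Loads issued between the two calls are the "subsequent memory operations" that must not be satisfied before the store to c_seq becomes visible to a thread polling it.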
