Author: marius
Date: Sat Oct  1 00:11:03 2011
New Revision: 225889
URL: http://svn.freebsd.org/changeset/base/225889

Log:
  In total store order (TSO), which we use for running the kernel and all of
  the userland, atomic operations behave as if they were followed by a memory
  barrier, so there's no need to include one in the acquire variants of
  atomic(9).
  Removing these results in a small performance improvement; specifically,
  this is sufficient to compensate for the performance loss seen in the
  worldstone benchmark when using SCHED_ULE instead of SCHED_4BSD.
  This change is inspired by Linux even more radically doing the equivalent
  thing some time ago.
  Thanks go to Peter Jeremy for additional testing.
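
  The acquire/release split described in the log can be illustrated with a
  small portable sketch. It is not the sparc64 code: C11 <stdatomic.h> stands
  in for the inline assembly, and every name (payload, ready, sketch_publish,
  sketch_consume, sketch_demo) is hypothetical. The release side orders
  earlier accesses before the flag store; the acquire side keeps later
  accesses from moving ahead of the flag load. Under total store order the
  acquire side should need no separate barrier instruction, which is the
  property this change relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the FreeBSD atomic(9) code. */
static uint32_t payload;		/* plain data published via the flag */
static _Atomic uint32_t ready;		/* 0 = not published, 1 = published */

/* Release side: the payload store must be visible before the flag store. */
static void
sketch_publish(uint32_t value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Acquire side: the payload load must not move ahead of the flag load.
 * Returns 1 and fills *out if a value was published, 0 otherwise.
 */
static int
sketch_consume(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		return (0);
	*out = payload;
	return (1);
}

/* Single-threaded walk-through of the hand-off; returns the value read. */
static uint32_t
sketch_demo(void)
{
	uint32_t got = 0;

	assert(sketch_consume(&got) == 0);	/* nothing published yet */
	sketch_publish(42);
	assert(sketch_consume(&got) == 1);	/* flag observed, payload valid */
	return (got);
}
```

  In real use sketch_publish() and sketch_consume() would run on different
  CPUs; the single-threaded sketch_demo() only shows the calling pattern.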

Modified:
  head/sys/sparc64/include/atomic.h

Modified: head/sys/sparc64/include/atomic.h
==============================================================================
--- head/sys/sparc64/include/atomic.h   Fri Sep 30 20:23:18 2011        (r225888)
+++ head/sys/sparc64/include/atomic.h   Sat Oct  1 00:11:03 2011        (r225889)
@@ -74,12 +74,16 @@
  *
  * the return value of cas is used to avoid the extra reload.
  *
- * The memory barriers provided by the acq and rel variants are intended
- * to be sufficient for use of relaxed memory ordering.  Due to the
- * suggested assembly syntax of the membar operands containing a #
- * character, they cannot be used in macros.  The cmask and mmask bits
+ * We only include a memory barrier in the rel variants as in total store
+ * order which we use for running the kernel and all of the userland atomic
+ * loads and stores behave as if they were followed by a membar with a mask
+ * of #LoadLoad | #LoadStore | #StoreStore.  In order to be also sufficient
+ * for use of relaxed memory ordering, the atomic_cas() in the acq variants 
+ * additionally would have to be followed by a membar #LoadLoad | #LoadStore.
+ * Due to the suggested assembly syntax of the membar operands containing a
+ * # character, they cannot be used in macros.  The cmask and mmask bits thus
  * are hard coded in machine/cpufunc.h and used here through macros.
- * Hopefully sun will choose not to change the bit numbers.
+ * Hopefully the bit numbers won't change in the future.
  */
 
 #define        itype(sz)       uint ## sz ## _t
@@ -93,7 +97,6 @@
 #define        atomic_cas_acq(p, e, s, sz) ({                                  \
        itype(sz) v;                                                    \
        v = atomic_cas(p, e, s, sz);                                    \
-       membar(LoadLoad | LoadStore);                                   \
        v;                                                              \
 })
 
@@ -118,7 +121,6 @@
 #define        atomic_op_acq(p, op, v, sz) ({                                  \
        itype(sz) t;                                                    \
        t = atomic_op(p, op, v, sz);                                    \
-       membar(LoadLoad | LoadStore);                                   \
        t;                                                              \
 })
 
@@ -135,7 +137,6 @@
 #define        atomic_load_acq(p, sz) ({                                       \
        itype(sz) v;                                                    \
        v = atomic_load(p, sz);                                         \
-       membar(LoadLoad | LoadStore);                                   \
        v;                                                              \
 })
 
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
