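I wrote a simple test. On 5.2/Intel/MP it has always passed, but that proves little, since Intel's stronger memory ordering takes care of this in hardware. Would anyone care to run it on a multicore ARM device? I don't have one available. If the lock is working, the two totals it prints should match.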

#include <time.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>	/* sleep() */
#include <sys/types.h>
#include <spinlock.h>

static volatile int running;
static volatile _spinlock_lock_t lock = _SPINLOCK_UNLOCKED;
static volatile int counter;

/* Increment the shared counter under the spinlock. */
static void inc( void )
{
	while( _atomic_lock( &lock ) );
	counter++;
	lock = _SPINLOCK_UNLOCKED;
}

/* Each thread also counts its own increments without any locking. */
static void * do_work( void * tdata )
{
	int local_count = 0;
	int * thread_data = tdata;
	*thread_data = 0;
	while( running )
	{
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
		inc();local_count++;
	}
	*thread_data = local_count;
	return NULL;
}

int main( void )
{
#define NUM_THREADS 4
	pthread_t tids[ NUM_THREADS ];
	int counts[ NUM_THREADS ];
	size_t i;
	time_t start_time;
	int sum;

	start_time = time(NULL);
	/*attempt to align on seconds-edge*/
	while( time(NULL) == start_time );
	start_time = time(NULL);

	counter = 0;
	running = 1;
	for( i = 0; i < NUM_THREADS; ++i )
		pthread_create( &tids[i], NULL, do_work, &counts[i] );
	sleep(5);
	running = 0;
	for( i = 0; i < NUM_THREADS; ++i )
		pthread_join( tids[i], NULL );

	/* With a correct lock, the per-thread sum equals the global counter. */
	sum = 0;
	for( i = 0; i < NUM_THREADS; ++i )
		sum += counts[i];
	printf("Thread Total:%i, Global Total:%i\n", sum, counter );
	return 0;
}

On Thu, Aug 1, 2013 at 9:32 AM, Patrick Wildt <m...@patrick-wildt.de> wrote:
> Same thoughts here. You'd have to define DMB in _atomic_lock.c as
> you did for cpufunc_asm_armv7.S and use that.
>
> Also, I think it's the old binutils 2.15/as, not gcc.
>
> On 01.08.2013 at 04:56, Artturi Alm <artturi....@gmail.com> wrote:
>
>> Like i wrote earlier, i don't think it's supported, because of
>> -march=armv6 and gcc version 4.2.1 20070719, unless i have missed
>> something.
>>
>> On 08/01/13 02:28, Richard Allen wrote:
>>> Could we just use GCC intrinsics in C?
>>> On Jul 31, 2013 9:08 AM, "Artturi Alm" <artturi....@gmail.com> wrote:
>>>
>>>> On 07/31/13 16:37, Artturi Alm wrote:
>>>>
>>>>> On 07/31/13 08:57, Richard Allen wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I just wanted to let you know that _atomic_lock(), from _atomic_lock.c,
>>>>>> as used by librthread should probably have a barrier instruction added
>>>>>> to prevent the processor from reordering loads/stores around the
>>>>>> atomic_lock.
>>>>>>
>>>>>> For more information about barriers on ARM, see:
>>>>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/CIHGHHIE.html
>>>>>>
>>>>>> For some examples, see sections 7.2.1 and 7.2.2
>>>>>> http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf
>>>>>>
>>>>>> -Richard
>>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm not sure what the diff against _atomic_lock.c would look like,
>>>>> since i'm guessing it might not be supported, and i don't like
>>>>> inline asm so i'll leave it to someone else, however, diff for
>>>>> using it in cpufunc_asm_armv7.S would be something like below
>>>>>
>>>> Wow, that was unclear. what i meant with 'it' was of course the dmb
>>>> instruction, and fwiw freebsd uses it exclusively in atomic.h inlines.
>>>>
>>>>> -Artturi
>>>>>
>>>>> Index: cpufunc_asm_armv7.S
>>>>> ===================================================================
>>>>> RCS file: /cvs/src/sys/arch/arm/arm/cpufunc_asm_armv7.S,v
>>>>> retrieving revision 1.6
>>>>> diff -u -p -r1.6 cpufunc_asm_armv7.S
>>>>> --- cpufunc_asm_armv7.S 30 Mar 2013 01:30:30 -0000 1.6
>>>>> +++ cpufunc_asm_armv7.S 31 Jul 2013 13:26:18 -0000
>>>>> @@ -19,6 +19,7 @@
>>>>>  #include <machine/asm.h>
>>>>>
>>>>>  #define DSB .long 0xf57ff040
>>>>> +#define DMB .long 0xf57ff050
>>>>>  #define ISB .long 0xf57ff060
>>>>>  #define WFI .long 0xe320f003
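
On the GCC intrinsics question: purely as a sketch, and assuming a toolchain that can actually expand them for the target (which is exactly what is in doubt for gcc 4.2.1 with -march=armv6), a lock with acquire/release ordering could look roughly like the following. It uses the GCC __sync_lock_test_and_set() and __sync_lock_release() builtins, which GCC documents as an acquire barrier and a release barrier respectively. The sketch_* names and locked_increments() are made up for illustration; this is not the librthread code and not a proposed diff.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; not the librthread implementation. */
static volatile int sketch_lock_word;	/* 0 = unlocked, 1 = locked */
static int sketch_counter;		/* protected by sketch_lock_word */

static void sketch_lock(void)
{
	/* Atomic exchange; GCC documents this builtin as an acquire barrier. */
	while (__sync_lock_test_and_set(&sketch_lock_word, 1))
		;	/* spin until the previous value was 0 */
}

static void sketch_unlock(void)
{
	/* Stores 0; GCC documents this builtin as a release barrier. */
	__sync_lock_release(&sketch_lock_word);
}

static void *sketch_worker(void *arg)
{
	int i, iters = *(int *)arg;

	for (i = 0; i < iters; i++) {
		sketch_lock();
		sketch_counter++;
		sketch_unlock();
	}
	return NULL;
}

/* Runs nthreads workers (capped at 16) and returns the final counter. */
static int locked_increments(int nthreads, int iters)
{
	pthread_t tids[16];
	int i;

	if (nthreads > 16)
		nthreads = 16;
	sketch_counter = 0;
	for (i = 0; i < nthreads; i++)
		pthread_create(&tids[i], NULL, sketch_worker, &iters);
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	return sketch_counter;
}
```

Calling locked_increments(4, 50000) from a main() and comparing the result against 4 * 50000 gives the same kind of check as the test program at the top, just with a fixed iteration count instead of a timed run. Whether the builtins end up emitting a real barrier on a given ARM target still has to be checked in the generated assembly.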