http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48126
Dr. David Alan Gilbert <david.gilbert at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.gilbert at linaro dot
                   |                            |org

--- Comment #5 from Dr. David Alan Gilbert <david.gilbert at linaro dot org> 2011-06-22 16:40:07 UTC ---
Michael:
  I think I agree with you on the need for the barrier in the branch out case;
gcc's info page (section 6.49 'Built-in functions for atomic memory access')
states:

-----
In most cases, these builtins are considered a "full barrier". That is, no
memory operand will be moved across the operation, either forward or backward.
Further, instructions will be issued as necessary to prevent the processor
from speculating loads across the operation and from queuing stores after the
operation.
------

so it does look like that last barrier would be needed to stop a subsequent
load floating backwards before the ldrex.

If I understand correctly, however, most cases wouldn't need it - I think most
cases use the compare&swap to take some form of lock, and then once you know
you have the lock go and do your accesses - and in that case the ordering is
guaranteed, whereas if you couldn't take the lock you wouldn't use the
subsequent access anyway.

Dave
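
For illustration only, here is a minimal sketch of the lock-taking pattern described above, using the documented `__sync_bool_compare_and_swap` and `__sync_lock_release` builtins. The names `lock_word`, `shared_value`, `try_lock`, `unlock` and `try_increment` are hypothetical and are not taken from the bug report. The sketch assumes the usual convention that the protected data is only touched on the success path.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
static int lock_word = 0;     /* 0 = free, 1 = held */
static int shared_value = 0;  /* data protected by lock_word */

/* Try to take the lock with compare&swap; true means we now hold it. */
static bool try_lock(int *lock)
{
    /* The GCC manual documents this builtin as a full barrier. */
    return __sync_bool_compare_and_swap(lock, 0, 1);
}

/* Drop the lock; this builtin writes 0 with release semantics. */
static void unlock(int *lock)
{
    __sync_lock_release(lock);
}

/* Returns true if the update was made, false if the lock was busy. */
static bool try_increment(void)
{
    if (!try_lock(&lock_word))
        return false;   /* failed swap: shared_value is never accessed */

    shared_value++;     /* only reached after a successful swap */
    unlock(&lock_word);
    return true;
}
```

In this shape the access to `shared_value` depends on the swap having succeeded, which is the common case mentioned above. Code that reads other memory after a failed swap and relies on its ordering is the case where the barrier on the branch-out path would matter.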