http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48126
--- Comment #3 from Marcus Shawcroft <marcus.shawcroft at arm dot com> 2011-05-24 13:37:03 UTC ---
The primitive is required to have lock semantics, so the load of the old value must be followed by a dmb when the old value comparison succeeds and the swap goes ahead. In the branch-out case the final dmb serves no purpose: the swap did not occur and no lock was taken. Branching over the dmb is therefore OK.
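
As a rough illustration of the argument above, here is a minimal sketch in portable C11 atomics rather than ARM assembly. It is not GCC's actual expansion of the primitive. The helper name `cas_acquire` and its signature are hypothetical, and the acquire fence stands in for the dmb under the assumption that "lock semantics" means acquire ordering on a successful swap.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (an assumption for illustration, not a GCC API):
 * compare-and-swap whose barrier is issued only when the swap succeeds. */
static bool cas_acquire(atomic_int *ptr, int expected, int desired,
                        int *old_out)
{
    int old = expected;

    /* The exchange itself is relaxed, so it implies no ordering.
     * On failure, `old` is overwritten with the value actually observed. */
    bool swapped = atomic_compare_exchange_strong_explicit(
        ptr, &old, desired, memory_order_relaxed, memory_order_relaxed);

    *old_out = old;

    if (swapped) {
        /* Success path: the lock was taken, so later accesses must not be
         * reordered before the load of the old value. This fence plays the
         * role of the dmb that follows the load. */
        atomic_thread_fence(memory_order_acquire);
    }

    /* Failure path: no swap, no lock taken, so the fence is skipped.
     * This corresponds to the branch over the dmb. */
    return swapped;
}
```

A caller would treat a `true` return as "lock acquired" and may then touch the protected data. On a `false` return nothing was acquired, so no ordering guarantee is needed or provided.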