On 05/11/2016 09:59 AM, Ilya Verbin wrote:
On Wed, May 11, 2016 at 10:47:49 +0100, Ramana Radhakrishnan wrote:
I've looked at the generated code in more detail, and for armv6 this generates
mcr p15, 0, r0, c7, c10, 5
which is a CP15DMB, not the CP15DSB that __cilkrts_fence currently uses.
Wow, I hadn't noticed that it was a DSB - DSB is way too heavyweight. Userland
shouldn't need to use this by default IMNSHO. It's needed if you are working on
non-cacheable memory or performing cache maintenance operations, but I can't
imagine cilkplus wanting to do that!
http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf
It's almost like the default definitions need to be in terms of the atomic
extensions rather than being hand-written in this form. Folks usually get
this wrong!
Looking at arm/sync.md it seems that there is no way to generate CP15DSB.
No - there is no way of generating DSB; DMBs should be sufficient for this
purpose. Would anyone know what semantics of __cilkrts_fence would
require this to be a DSB?
__cilkrts_fence semantics are identical to __sync_synchronize, so DMB looks OK.
Maybe we should just define:
#define __cilkrts_fence() __sync_synchronize()
Certainly seems like indirecting through a compiler builtin rather than
using an ASM would be advisable.
jeff
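
For illustration only, the suggestion above could be sketched roughly as
follows. The #define is the one proposed in the thread; the payload/ready
variables and the publish()/try_consume() helpers are hypothetical names made
up for this example and are not part of the Cilk Plus runtime. The sketch
assumes GCC or a compatible compiler, since it relies on the __sync and
__atomic builtins.

```c
#include <assert.h>
#include <stddef.h>

/* As proposed in the thread: let the compiler choose the barrier
   instruction for the target instead of hard-coding an asm. */
#define __cilkrts_fence() __sync_synchronize()

/* Hypothetical shared state, for illustration only. */
static int payload;
static int ready;

/* Writer: store the data, issue a full barrier, then raise the flag. */
static void publish(int value)
{
    payload = value;
    __cilkrts_fence();
    __atomic_store_n(&ready, 1, __ATOMIC_RELAXED);
}

/* Reader: returns 1 and copies the data once the flag is visible,
   otherwise returns 0 and leaves *out untouched. */
static int try_consume(int *out)
{
    assert(out != NULL);
    if (!__atomic_load_n(&ready, __ATOMIC_RELAXED))
        return 0;
    __cilkrts_fence();
    *out = payload;
    return 1;
}
```

The point of the indirection is that the choice of barrier instruction is
left to the compiler's target support rather than being fixed in the
runtime's own inline asm.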