On 05/11/2016 09:59 AM, Ilya Verbin wrote:
On Wed, May 11, 2016 at 10:47:49 +0100, Ramana Radhakrishnan wrote:

I've looked at the generated code in more detail, and for armv6 this generates
mcr     p15, 0, r0, c7, c10, 5
which is a CP15DMB, not the CP15DSB that __cilkrts_fence currently uses.

Wow, I hadn't noticed that it was a DSB. DSB is way too heavyweight. Userland
shouldn't need to use this by default IMNSHO. It's needed if you are working on
non-cacheable memory or performing cache maintenance operations, but I can't
imagine cilkplus wanting to do that!

http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf

It's almost like the default definitions need to be in terms of the atomic
extensions rather than having these written in this form. Folks usually get
this wrong!

Looking at arm/sync.md it seems that there is no way to generate CP15DSB.

No, there is no way of generating DSB; DMBs should be sufficient for this
purpose. Would anyone know what the semantics of __cilkrts_fence are that
require this to be a DSB?

The semantics of __cilkrts_fence are identical to __sync_synchronize, so DMB looks OK.

Maybe we should just define:
  #define __cilkrts_fence() __sync_synchronize()
Indirecting through a compiler builtin rather than using inline asm certainly
seems advisable.

jeff
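
For reference, a minimal sketch of the definition suggested above, assuming a
compiler that provides the GCC-style __sync and __atomic builtins. Only the
#define comes from the thread; payload, ready, publish, try_consume and
fence_demo are hypothetical names used for illustration and are not part of
libcilkrts. The flag is accessed through relaxed atomic builtins so that the
fence is the only thing providing ordering.

```c
#include <assert.h>

/* Definition proposed in the thread: defer to the compiler builtin
   instead of hand-written inline assembly. */
#define __cilkrts_fence() __sync_synchronize()

/* Hypothetical shared state, for illustration only. */
static int payload;
static int ready;

/* Writer side: the full barrier orders the payload store
   before the store to the flag. */
static void publish(int value)
{
    payload = value;
    __cilkrts_fence();
    __atomic_store_n(&ready, 1, __ATOMIC_RELAXED);
}

/* Reader side: returns 1 and copies the payload into *out if the
   flag is set, otherwise returns 0 and leaves *out untouched. */
static int try_consume(int *out)
{
    if (!__atomic_load_n(&ready, __ATOMIC_RELAXED))
        return 0;
    __cilkrts_fence();   /* order the flag load before the payload load */
    *out = payload;
    return 1;
}

/* Single-threaded smoke test of the helpers above. */
static int fence_demo(void)
{
    int v = 0;
    int before = try_consume(&v);   /* nothing published yet */
    publish(42);
    int after = try_consume(&v);    /* flag is now set */
    assert(before == 0 && after == 1);
    return v;
}
```

In real use publish and try_consume would run on different threads; the
smoke test only checks that the macro expands and the helpers behave
sequentially. Which barrier instruction the builtin expands to is left to
the compiler and the selected target.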
