On 08/12/15 09:25, Tobias Burnus wrote:
On Mon, Dec 07, 2015 at 02:09:22PM +0000, Matthew Wahab wrote:
I wonder whether using
__asm__ __volatile__ ("":::"memory");
would be sufficient as it has a way lower overhead than
__sync_synchronize().
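
To make the difference concrete, here is a minimal sketch of the two barrier flavours side by side. The macro names, `shared_value` and the two helper functions are made up for illustration and are not part of the patch. The empty asm with a "memory" clobber emits no instruction and only stops GCC from caching or reordering memory accesses across it, while `__sync_synchronize()` additionally issues a full hardware barrier.

```c
/* Compiler-only barrier: no instruction is emitted, but the "memory"
   clobber makes GCC assume any memory may have changed, so values held
   in registers are reloaded after this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Full barrier: constrains the compiler and also orders the accesses
   in hardware (e.g. mfence on x86). */
#define FULL_BARRIER() __sync_synchronize()

static int shared_value; /* hypothetical variable, illustration only */

/* Store, then force a reload using only the compiler barrier. */
int store_then_reload_compiler_barrier(int v)
{
    shared_value = v;
    COMPILER_BARRIER();
    return shared_value;
}

/* Same sequence, but with the heavier full barrier. */
int store_then_reload_full_barrier(int v)
{
    shared_value = v;
    FULL_BARRIER();
    return shared_value;
}
```

In a single thread both functions return the value just stored; the difference only shows up in the generated code and in what other threads or devices are allowed to observe.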
I don't know anything about Fortran or coarrays and I'm curious
whether this affects architectures with weak memory models. Is the
barrier only needed to stop reordering by the compiler or does it
also need to stop reordering by the hardware?
Short answer: I think no mfence is needed as either the communication
is local (to the thread/process) - in which case the hardware will act
correctly - or the communication is remote (different thread, process,
communication to different computer via interlink [ethernet, infiniband,
...]); and in the latter case, the communication library has to deal with
it.
Thanks for explaining this, it made things clear. Based on your description, I agree
that hardware reordering shouldn't be a problem.
and the (main) program code (slightly trimmed):
static void * restrict caf_token.0;
static integer(kind=4) * restrict var;
void _caf_init.1 (void);
*var = 4;
desc.3.data = 42;
_gfortran_caf_send (caf_token.0, 0B /* offset */ var,
_gfortran_caf_this_image (0), &desc.2, 0B, &desc.3, 4,
4, 0);
__asm__ __volatile__("":::"memory"); // new
tmp = *var;
The problem is that in that case the compiler does not know that
"_gfortran_caf_send (caf_token.0," can modify "*var".
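
As a rough C analogue of the dump above (a sketch under stated assumptions, not what gfortran generates): `storage`, `token`, `fake_send` and `send_then_read` are hypothetical names, and `fake_send` merely stands in for `_gfortran_caf_send` by writing to the object through the token. The `restrict` qualifier is deliberately left off here, because in C modifying an object through another pointer while it is also accessed through a restrict-qualified one is undefined behaviour.

```c
#include <string.h>

static int storage;             /* hypothetical backing store for the coarray */
static void *token = &storage;  /* plays the role of caf_token.0 */
static int *var = &storage;     /* plays the role of var, without restrict */

/* Hypothetical stand-in for _gfortran_caf_send: it modifies the object
   through the token, not through var. */
static void fake_send(void *tok, const int *src)
{
    memcpy(tok, src, sizeof *src);
}

int send_then_read(void)
{
    int src = 42;

    *var = 4;
    fake_send(token, &src);
    /* Compiler barrier: any cached copy of *var is discarded here, so the
       read below has to go back to memory. */
    __asm__ __volatile__("" ::: "memory");
    return *var;
}
```

Calling `send_then_read()` yields the value written by the send rather than the earlier store of 4. With the restrict-qualified pointer in the real dump, the barrier is what guarantees that reload.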
Is the restrict attribute on var correct? From what you say, it sounds like *var
could be accessed through other pointers (assuming restrict has the same meaning as
in C).
Matthew