Based on the definition of a pure function, two calls to the same pure function with the same arguments should return the same value. This is obviously never true for posix_memalign.
I suggest we take a look at the generated code, and based on that we decide how to move forward on this. George. > On Jun 4, 2015, at 22:47 , Gilles Gouaillardet <gil...@rist.or.jp> wrote: > > Jeff, > > imho, this is a grey area ... > > 99.999% of the time, posix_memalign is a "pure" function. > "pure" means it has no side effects. > > unfortunatly, this part of the code is the 0.001% case in which we explicitly > rely on a side effect > (e.g. posix_memalign calls an Open MPI wrapper that updates a global variable) > > i am not surprised (nor shocked) the compiler assumes posix_memalign is side > effect free when (aggressive) optimisation is on, since it is a valid > asumption most of the time. > > Of course, the is a chance that might be a bug > (e.g. GCC folks forgot a wrapper can be added to posix_memalign, so it is not > 100% safe to assume posix_memalign is side effect free) > but as far as i am concerned, using "volatile" is correct. > > Cheers, > > Gilles > > > > > On 6/5/2015 10:26 AM, Jeff Squyres (jsquyres) wrote: >> On Jun 4, 2015, at 5:48 AM, René Oertel <rene.oer...@cs.tu-chemnitz.de> >> wrote: >>> Problem description: >>> =================== >>> >>> The critical code in question is in >>> opal/mca/memory/linux/memory_linux_ptmalloc2.c: >>> ##### >>> 92 #if HAVE_POSIX_MEMALIGN >>> 93 /* Double check for posix_memalign, too */ >>> 94 if (mca_memory_linux_component.memalign_invoked) { >>> 95 mca_memory_linux_component.memalign_invoked = false; >>> 96 if (0 != posix_memalign(&p, sizeof(void*), 1024 * 1024)) { >>> 97 return OPAL_ERR_IN_ERRNO; >>> 98 } >>> 99 free(p); >>> 100 } >>> 101 #endif >>> 102 >>> 103 if (mca_memory_linux_component.malloc_invoked && >>> 104 mca_memory_linux_component.realloc_invoked && >>> 105 mca_memory_linux_component.memalign_invoked && >>> 106 mca_memory_linux_component.free_invoked) { >>> 107 /* Happiness; our functions were invoked */ >>> 108 val |= OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT; >>> 109 } >>> [...] >>> 121 /* All done */ >>> 122 if (val > 0) { >>> 123 opal_mem_hooks_set_support(val); >>> 124 return OPAL_SUCCESS; >>> 125 } >>> ##### >>> >>> The code at lines 103-109 is legally optimized away with >=GCC-4.9 and >>> optimizations turned on, >> I'm not sure what you mean by "legally optimized away" -- shouldn't gcc know >> that the call to posix_memalign() can change global variables (such as >> mca_memory_linux_component.<foo>)? >> >>> because line 105 could never become true with >>> the local knowledge of the compiler/optimizer. >> My compiler knowledge may be a bit rusty, but if this optimization is being >> made solely with local knowledge, this sounds like a buggy optimization...? >> >>> If mca_memory_linux_component.memalign_invoked == true at line 92, it >>> would be set to false at line 95. >> ...but potentially set to true again in the body of posix_memalign(). >> ...which you describe below. >> >>> If mca_memory_linux_component.memalign_invoked == false at line 92, it >>> would be false at line 103, too. >>> In both cases, the if at line 103-106 could never be evaluated to true >>> and opal_mem_hooks_set_support is never called with >>> OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT resulting in >>> (silently) disabled mpi_leaved_pinned. >>> >>> In the global view this local assumption is not true, because >>> posix_memalign() in line 96 will call the hook public_mEMALIGn() in >>> opal/mca/memory/linux/malloc.c which in turn sets >>> mca_memory_linux_component.memalign_invoked = true. >>> As a result, the OPAL_MEMORY_*_SUPPORT would get configured correctly in >>> line 123 and so the opal_mem_hooks_support_level() used by >>> ompi/mca/btl/openib/btl_openib_component.c and indirectly by the >>> ompi/mca/mpool/grdma/mpool_grdma* module enables the usage of pinned memory. >> How can a compiler not take the global view? Taking a local-only view is >> unsafe -- for exactly these kinds of reasons. >> >>> The optimization could be disabled by adding -fno-tree-partial-pre to >>> the CFLAGS in opal/mca/memory/linux/Makefile, but this is only a >>> temporary workaround. >>> >>> Patch: >>> ===== >>> >>> The proposed patch is as follows, which alters the point-of-view of the >>> compiler's optimizer on the *_invoked variables used by different code >>> paths (memory_linux_ptmalloc2.c vs. malloc.c): >>> >>> ##### >>> diff -rupN openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h >>> openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h >>> --- openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h >>> 2014-10-03 22:32:23.000000000 +0200 >>> +++ openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h 2015-06-04 >>> 10:01:44.941544282 +0200 >>> @@ -33,11 +33,11 @@ typedef struct opal_memory_linux_compone >>> >>> #if MEMORY_LINUX_PTMALLOC2 >>> /* Ptmalloc2-specific data */ >>> - bool free_invoked; >>> - bool malloc_invoked; >>> - bool realloc_invoked; >>> - bool memalign_invoked; >>> - bool munmap_invoked; >>> + volatile bool free_invoked; >>> + volatile bool malloc_invoked; >>> + volatile bool realloc_invoked; >>> + volatile bool memalign_invoked; >>> + volatile bool munmap_invoked; >>> #endif >>> } opal_memory_linux_component_t; >> I don't really want to apply this patch (thanks for filing the PR, BTW) >> without understanding why the compiler is not automatically taking the >> global view. It seems unsafe. >> >>> ##### >>> >>> Additionally, a further patch should be applied generating a warning in >>> the GPUDirect module if leave_pinned is not available for some reason. >>> In this case, GPUDirect support should be disabled, because it runs >>> faster without (see Case 2 below). >> I'll let Rolf / NVIDIA comment on the GPU implications. >> >> [snip] >> >>> Best regards, >>> >>> René "oere" Oertel >> Many thanks for the detailed report! >> > > _______________________________________________ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/06/27042.php