On Jun 4, 2015, at 5:48 AM, René Oertel <rene.oer...@cs.tu-chemnitz.de> wrote: > > Problem description: > =================== > > The critical code in question is in > opal/mca/memory/linux/memory_linux_ptmalloc2.c: > ##### > 92 #if HAVE_POSIX_MEMALIGN > 93 /* Double check for posix_memalign, too */ > 94 if (mca_memory_linux_component.memalign_invoked) { > 95 mca_memory_linux_component.memalign_invoked = false; > 96 if (0 != posix_memalign(&p, sizeof(void*), 1024 * 1024)) { > 97 return OPAL_ERR_IN_ERRNO; > 98 } > 99 free(p); > 100 } > 101 #endif > 102 > 103 if (mca_memory_linux_component.malloc_invoked && > 104 mca_memory_linux_component.realloc_invoked && > 105 mca_memory_linux_component.memalign_invoked && > 106 mca_memory_linux_component.free_invoked) { > 107 /* Happiness; our functions were invoked */ > 108 val |= OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT; > 109 } > [...] > 121 /* All done */ > 122 if (val > 0) { > 123 opal_mem_hooks_set_support(val); > 124 return OPAL_SUCCESS; > 125 } > ##### > > The code at lines 103-109 is legally optimized away with >=GCC-4.9 and > optimizations turned on,
I'm not sure what you mean by "legally optimized away" -- shouldn't gcc know that the call to posix_memalign() can change global variables (such as mca_memory_linux_component.<foo>)? > because line 105 could never become true with > the local knowledge of the compiler/optimizer. My compiler knowledge may be a bit rusty, but if this optimization is being made solely with local knowledge, this sounds like a buggy optimization...? > If mca_memory_linux_component.memalign_invoked == true at line 92, it > would be set to false at line 95. ...but potentially set to true again in the body of posix_memalign(). ...which you describe below. > If mca_memory_linux_component.memalign_invoked == false at line 92, it > would be false at line 103, too. > In both cases, the if at line 103-106 could never be evaluated to true > and opal_mem_hooks_set_support is never called with > OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT resulting in > (silently) disabled mpi_leaved_pinned. > > In the global view this local assumption is not true, because > posix_memalign() in line 96 will call the hook public_mEMALIGn() in > opal/mca/memory/linux/malloc.c which in turn sets > mca_memory_linux_component.memalign_invoked = true. > As a result, the OPAL_MEMORY_*_SUPPORT would get configured correctly in > line 123 and so the opal_mem_hooks_support_level() used by > ompi/mca/btl/openib/btl_openib_component.c and indirectly by the > ompi/mca/mpool/grdma/mpool_grdma* module enables the usage of pinned memory. How can a compiler not take the global view? Taking a local-only view is unsafe -- for exactly these kinds of reasons. > The optimization could be disabled by adding -fno-tree-partial-pre to > the CFLAGS in opal/mca/memory/linux/Makefile, but this is only a > temporary workaround. > > Patch: > ===== > > The proposed patch is as follows, which alters the point-of-view of the > compiler's optimizer on the *_invoked variables used by different code > paths (memory_linux_ptmalloc2.c vs. malloc.c): > > ##### > diff -rupN openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h > openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h > --- openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h > 2014-10-03 22:32:23.000000000 +0200 > +++ openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h 2015-06-04 > 10:01:44.941544282 +0200 > @@ -33,11 +33,11 @@ typedef struct opal_memory_linux_compone > > #if MEMORY_LINUX_PTMALLOC2 > /* Ptmalloc2-specific data */ > - bool free_invoked; > - bool malloc_invoked; > - bool realloc_invoked; > - bool memalign_invoked; > - bool munmap_invoked; > + volatile bool free_invoked; > + volatile bool malloc_invoked; > + volatile bool realloc_invoked; > + volatile bool memalign_invoked; > + volatile bool munmap_invoked; > #endif > } opal_memory_linux_component_t; I don't really want to apply this patch (thanks for filing the PR, BTW) without understanding why the compiler is not automatically taking the global view. It seems unsafe. > ##### > > Additionally, a further patch should be applied generating a warning in > the GPUDirect module if leave_pinned is not available for some reason. > In this case, GPUDirect support should be disabled, because it runs > faster without (see Case 2 below). I'll let Rolf / NVIDIA comment on the GPU implications. [snip] > Best regards, > > René "oere" Oertel Many thanks for the detailed report! -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/